TimescaleDB Performance Optimization

This guide covers performance optimization strategies for TimescaleDB in the PRS production environment.

Overview

TimescaleDB provides significant performance improvements for time-series data through:

  • Chunk-based partitioning - Automatic time-based data partitioning
  • Query optimization - Intelligent chunk exclusion for faster queries
  • Compression - Reduced I/O through data compression
  • Parallel processing - Multi-core query execution

Performance Benefits in PRS

Query Performance Improvements

| Query Type | Before TimescaleDB | After TimescaleDB | Improvement |
|---|---|---|---|
| Time-range queries | Full table scan | Chunk exclusion | 10-100x faster |
| Aggregations | Slow on large tables | Optimized chunks | 5-50x faster |
| Recent data queries | Index-dependent | Chunk-optimized | 3-20x faster |
| Analytics queries | Memory-intensive | Compressed data | 2-10x faster |

Storage Efficiency

| Data Age | Compression Ratio | Storage Savings | Query Impact |
|---|---|---|---|
| < 30 days | Uncompressed | 0% | Fastest queries |
| 30-90 days | 70-80% | High savings | Fast queries |
| 90+ days | 80-90% | Maximum savings | Good performance |
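To see how these tiers combine, here is a quick back-of-the-envelope calculation; the data-age distribution and raw size below are hypothetical illustrations, not measured PRS figures:

```javascript
// Hypothetical sketch: estimate blended storage use from the compression
// tiers above. Shares and compression ratios are illustrative assumptions.
const tiers = [
  { share: 0.3, ratio: 0.0 },  // < 30 days: uncompressed
  { share: 0.3, ratio: 0.75 }, // 30-90 days: ~70-80% compression
  { share: 0.4, ratio: 0.85 }, // 90+ days: ~80-90% compression
];

// Sum each tier's post-compression footprint.
function blendedFootprint(rawBytes, tiers) {
  return tiers.reduce(
    (sum, t) => sum + rawBytes * t.share * (1 - t.ratio),
    0
  );
}

const raw = 100 * 1024 ** 3; // 100 GiB of raw data (made-up figure)
console.log((blendedFootprint(raw, tiers) / 1024 ** 3).toFixed(1), 'GiB');
```

With this hypothetical mix, 100 GiB of raw data would occupy well under half its original footprint once the older tiers are compressed.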

Optimization Strategies

1. Query Optimization

Time-based Query Patterns

SQL
-- OPTIMIZED: Use time-based WHERE clauses
SELECT COUNT(*) FROM audit_logs
WHERE created_at >= NOW() - INTERVAL '30 days';

-- OPTIMIZED: Use time_bucket for aggregations
SELECT
    time_bucket('1 hour', created_at) as hour,
    COUNT(*) as events
FROM audit_logs
WHERE created_at >= NOW() - INTERVAL '24 hours'
GROUP BY hour
ORDER BY hour;

-- OPTIMIZED: Combine time and other filters
SELECT * FROM requisitions
WHERE created_at >= '2024-01-01'
  AND created_at < '2024-02-01'
  AND status = 'approved';

Avoid Anti-patterns

SQL
-- AVOID: Queries without time constraints
SELECT COUNT(*) FROM audit_logs;  -- Scans all chunks

-- AVOID: Functions on time columns in WHERE
SELECT * FROM audit_logs
WHERE DATE(created_at) = '2024-01-01';  -- Prevents chunk exclusion

-- BETTER: Use range queries
SELECT * FROM audit_logs
WHERE created_at >= '2024-01-01'
  AND created_at < '2024-01-02';
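On the application side, the same fix means computing a half-open time range and binding it as query parameters rather than wrapping the column in DATE(). A minimal sketch (the helper name is illustrative, not part of PRS):

```javascript
// Illustrative helper: turn a calendar date into the half-open UTC range
// [start, end) that lets TimescaleDB exclude chunks, instead of applying
// DATE() to the column and defeating chunk exclusion.
function dayRangeUtc(dateString) {
  const start = new Date(`${dateString}T00:00:00Z`);
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { start, end };
}

const { start, end } = dayRangeUtc('2024-01-01');
// Bind the two values as parameters:
// WHERE created_at >= $1 AND created_at < $2
console.log(start.toISOString(), end.toISOString());
```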

2. Index Optimization

SQL
-- Time-based indexes (usually automatic)
CREATE INDEX IF NOT EXISTS idx_audit_logs_created_at
ON audit_logs (created_at DESC);

-- Composite indexes for common queries
CREATE INDEX IF NOT EXISTS idx_audit_logs_user_time
ON audit_logs (user_id, created_at DESC);

-- Partial indexes for specific conditions
CREATE INDEX IF NOT EXISTS idx_requisitions_pending
ON requisitions (created_at)
WHERE status = 'pending';

Index Maintenance

Bash
# Check index usage
docker exec -e PGPASSWORD="${POSTGRES_PASSWORD}" prs-ec2-postgres-timescale \
    psql -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" -c "
SELECT
    schemaname,
    tablename,
    indexname,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_tup_read > 0
ORDER BY idx_tup_read DESC;"

# Rebuild an index if needed (REINDEX is SQL, so run it via psql;
# CONCURRENTLY avoids blocking writes)
docker exec -e PGPASSWORD="${POSTGRES_PASSWORD}" prs-ec2-postgres-timescale \
    psql -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" -c "REINDEX INDEX CONCURRENTLY idx_name;"

3. Compression Optimization

Compression Configuration

SQL
-- Enable compression on hypertable
ALTER TABLE audit_logs SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'user_id',
    timescaledb.compress_orderby = 'created_at DESC'
);

-- Add compression policy
SELECT add_compression_policy('audit_logs', INTERVAL '30 days');

Compression Best Practices

  • Choose segmentby columns carefully - Pick columns you commonly filter or group by, where each value covers many rows (near-unique columns make poor segments)
  • Order by time columns - Optimize for time-based queries
  • Compress older data - Balance query performance with storage savings
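A quick way to sanity-check a segmentby candidate is to measure how many rows each distinct value covers. A sketch with made-up sample data (the threshold you apply is your own judgment call):

```javascript
// Rough heuristic sketch: a good compress_segmentby candidate repeats often
// (many rows per distinct value); near-unique columns make poor segments.
// Sample data below is illustrative.
function rowsPerValue(values) {
  return values.length / new Set(values).size;
}

const userIds = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3];          // repeats often
const requestIds = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110]; // unique

console.log(rowsPerValue(userIds));    // several rows per user: decent segmentby
console.log(rowsPerValue(requestIds)); // one row per value: poor segmentby
```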

4. Chunk Management

Optimal Chunk Sizing

SQL
-- Check current chunk intervals
SELECT
    hypertable_name,
    time_interval
FROM timescaledb_information.dimensions;

-- Adjust chunk interval if needed (for new data)
SELECT set_chunk_time_interval('audit_logs', INTERVAL '1 week');
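A common rule of thumb is that recent chunks plus their indexes should fit in roughly 25% of available memory. The arithmetic can be sketched as follows; the ingest figure is a made-up assumption, so measure your real daily volume before changing intervals:

```javascript
// Sketch of the common TimescaleDB sizing heuristic: recent chunks plus
// their indexes should fit in ~25% of memory. Inputs are illustrative.
function suggestChunkIntervalDays(memoryBytes, ingestBytesPerDay) {
  const budget = memoryBytes * 0.25;               // memory available for hot chunks
  return Math.max(1, Math.floor(budget / ingestBytesPerDay));
}

const memory = 4 * 1024 ** 3;   // t4g.medium: 4 GiB RAM
const ingest = 150 * 1024 ** 2; // ~150 MiB/day (made-up figure)
console.log(suggestChunkIntervalDays(memory, ingest)); // days per chunk
```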

Chunk Monitoring

SQL
-- Monitor chunk sizes
SELECT
    chunk_name,
    pg_size_pretty(pg_total_relation_size(format('%I.%I', chunk_schema, chunk_name))) as size,
    range_start,
    range_end
FROM timescaledb_information.chunks
WHERE hypertable_name = 'audit_logs'
ORDER BY range_start DESC
LIMIT 10;

Performance Maintenance

1. Regular Optimization Tasks

Bash
# Run comprehensive optimization
./scripts/deploy-ec2.sh timescaledb-optimize

# Advanced maintenance cycle
./scripts/deploy-ec2.sh timescaledb-maintenance full-maintenance

# Manual compression of eligible chunks
./scripts/deploy-ec2.sh timescaledb-maintenance compress

2. Performance Monitoring

Query Performance Analysis

SQL
-- Enable pg_stat_statements for query monitoring
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slow queries (PostgreSQL 13+ renamed total_time/mean_time
-- to total_exec_time/mean_exec_time)
SELECT
    query,
    calls,
    total_exec_time,
    mean_exec_time,
    rows
FROM pg_stat_statements
WHERE mean_exec_time > 1000  -- Queries averaging more than 1 second
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Reset statistics
SELECT pg_stat_statements_reset();

Resource Usage Monitoring

Bash
# Monitor database performance
./scripts/deploy-ec2.sh monitor

# Check connection usage
docker exec -e PGPASSWORD="${POSTGRES_PASSWORD}" prs-ec2-postgres-timescale \
    psql -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" -c "
SELECT
    count(*) as total_connections,
    count(*) FILTER (WHERE state = 'active') as active,
    count(*) FILTER (WHERE state = 'idle') as idle
FROM pg_stat_activity;"

3. Automated Performance Optimization

Cron Job for Regular Optimization

Bash
# Add to crontab for weekly optimization (runs at 03:00 every Sunday)
0 3 * * 0 cd /home/ubuntu/prs-prod/prs-production-deployment/ec2-graviton-setup && ./scripts/deploy-ec2.sh timescaledb-optimize >> /var/log/timescaledb-optimize.log 2>&1

Application-Level Optimizations

1. Query Patterns for Developers

Efficient Time-Range Queries

JavaScript
// GOOD: Use time-based filters
const recentAudits = await AuditLog.findAll({
  where: {
    created_at: {
      [Op.gte]: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) // Last 30 days
    }
  },
  order: [['created_at', 'DESC']],
  limit: 100
});

// GOOD: Use time buckets for aggregations
const [hourlyStats] = await sequelize.query(`
  SELECT
    time_bucket('1 hour', created_at) as hour,
    COUNT(*) as count
  FROM audit_logs
  WHERE created_at >= NOW() - INTERVAL '24 hours'
  GROUP BY hour
  ORDER BY hour
`);
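When rows are already in memory, the same fixed-width bucketing can be approximated client-side. This sketch mimics time_bucket for epoch-aligned fixed widths only, not for month-sized or timezone-aware buckets:

```javascript
// Approximate time_bucket('1 hour', ts) in JS by flooring the timestamp to
// the bucket width, aligned to the Unix epoch. Not TimescaleDB's actual
// implementation; fixed-width buckets only.
function timeBucket(widthMs, date) {
  return new Date(Math.floor(date.getTime() / widthMs) * widthMs);
}

const hour = 60 * 60 * 1000;
console.log(timeBucket(hour, new Date('2024-01-01T10:42:17Z')).toISOString());
// → 2024-01-01T10:00:00.000Z
```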

Pagination Best Practices

JavaScript
// GOOD: Use time-based pagination
const getAuditLogs = async (beforeTime, limit = 50) => {
  return await AuditLog.findAll({
    where: {
      created_at: {
        [Op.lt]: beforeTime
      }
    },
    order: [['created_at', 'DESC']],
    limit
  });
};

// AVOID: OFFSET-based pagination on large tables
// This gets slower as offset increases
const badPagination = await AuditLog.findAll({
  offset: page * limit,  // Slow for large offsets
  limit
});
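Driving the time-based (keyset) pagination above in a loop looks like this; the in-memory array stands in for AuditLog, and all names and data are illustrative:

```javascript
// Sketch: iterate keyset pagination to exhaustion against an in-memory
// stand-in for AuditLog (illustrative data, not the real model).
const rows = Array.from({ length: 7 }, (_, i) => ({
  id: i + 1,
  created_at: new Date(Date.UTC(2024, 0, i + 1)),
}));

// Mimics the time-based query: rows strictly before `beforeTime`, newest first.
async function fetchPage(beforeTime, limit = 3) {
  return rows
    .filter((r) => r.created_at < beforeTime)
    .sort((a, b) => b.created_at - a.created_at)
    .slice(0, limit);
}

async function fetchAll() {
  const all = [];
  let cursor = new Date('2100-01-01');
  for (;;) {
    const page = await fetchPage(cursor);
    if (page.length === 0) break;
    all.push(...page);
    cursor = page[page.length - 1].created_at; // next page starts before last row seen
  }
  return all;
}

fetchAll().then((all) => console.log(all.map((r) => r.id).join(',')));
```

Each page's cost stays constant because the cursor, not an OFFSET, defines where the next page starts.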

2. Connection Pool Optimization

Sequelize Configuration

JavaScript
// Optimize connection pool for TimescaleDB
const sequelize = new Sequelize(database, username, password, {
  host: 'localhost',
  dialect: 'postgres',
  pool: {
    max: 5,          // Maximum connections (EC2 t4g.medium limit)
    min: 1,          // Minimum connections
    acquire: 30000,  // Maximum time to get connection
    idle: 10000,     // Maximum idle time
    evict: 20000     // Time to evict idle connections
  },
  logging: false,    // Disable query logging in production
  dialectOptions: {
    ssl: process.env.NODE_ENV === 'production' ? {
      require: false,
      rejectUnauthorized: false
    } : false
  }
});

3. Caching Strategies

Application-Level Caching

JavaScript
// Cache frequently accessed data
const NodeCache = require('node-cache');
const cache = new NodeCache({ stdTTL: 300 }); // 5 minutes

const getCachedStats = async () => {
  const cacheKey = 'dashboard_stats';
  let stats = cache.get(cacheKey);

  if (!stats) {
    // sequelize.query() resolves to [rows, metadata]
    const [rows] = await sequelize.query(`
      SELECT
        COUNT(*) as total_requisitions,
        COUNT(*) FILTER (WHERE status = 'pending') as pending,
        COUNT(*) FILTER (WHERE created_at >= NOW() - INTERVAL '24 hours') as today
      FROM requisitions
    `);
    stats = rows[0];
    cache.set(cacheKey, stats);
  }

  return stats;
};

Performance Troubleshooting

1. Slow Query Diagnosis

Identify Slow Queries

SQL
-- Check currently running queries
SELECT
    pid,
    now() - pg_stat_activity.query_start AS duration,
    query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';

-- Analyze query execution plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM audit_logs
WHERE created_at >= NOW() - INTERVAL '30 days';

Query Optimization Steps

  1. Add time constraints - Ensure queries filter by time
  2. Check chunk exclusion - Verify chunks are being excluded
  3. Review indexes - Ensure appropriate indexes exist
  4. Consider compression - Check if data should be compressed

2. Resource Bottlenecks

Memory Issues

Bash
# Check memory usage
free -h
docker stats prs-ec2-postgres-timescale --no-stream

# Adjust PostgreSQL memory settings if needed
# In .env file:
POSTGRES_SHARED_BUFFERS=128MB
POSTGRES_EFFECTIVE_CACHE_SIZE=512MB
POSTGRES_WORK_MEM=4MB

I/O Bottlenecks

Bash
# Monitor disk I/O
iostat -x 1 5

# Check for compression opportunities
./scripts/deploy-ec2.sh timescaledb-maintenance status

3. Connection Issues

Too Many Connections

SQL
-- Check connection limits
SELECT
    setting as max_connections,
    (SELECT count(*) FROM pg_stat_activity) as current_connections
FROM pg_settings
WHERE name = 'max_connections';

-- Kill long-running connections if needed
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND now() - state_change > interval '1 hour';

Performance Benchmarking

1. Baseline Performance Tests

SQL
-- Test time-range query performance
\timing on
SELECT COUNT(*) FROM audit_logs
WHERE created_at >= NOW() - INTERVAL '30 days';

-- Test aggregation performance
SELECT
    time_bucket('1 day', created_at) as day,
    COUNT(*) as events
FROM audit_logs
WHERE created_at >= NOW() - INTERVAL '90 days'
GROUP BY day
ORDER BY day;

-- Test join performance
SELECT
    r.id,
    r.title,
    COUNT(a.id) as audit_count
FROM requisitions r
LEFT JOIN audit_logs a ON a.requisition_id = r.id
WHERE r.created_at >= NOW() - INTERVAL '30 days'
GROUP BY r.id, r.title
ORDER BY audit_count DESC
LIMIT 10;

2. Performance Metrics Tracking

Bash
#!/bin/bash
# Performance tracking script (save to a file; the shebang must be line 1)
echo "=== TimescaleDB Performance Report ==="
echo "Date: $(date)"

# Query performance
echo "Recent query performance:"
docker exec -e PGPASSWORD="${POSTGRES_PASSWORD}" prs-ec2-postgres-timescale \
    psql -U "${POSTGRES_USER}" -d "${POSTGRES_DB}" -c "
SELECT
    ROUND(mean_exec_time::numeric, 2) as avg_time_ms,
    calls,
    LEFT(query, 50) as query_sample
FROM pg_stat_statements
WHERE calls > 10
ORDER BY mean_exec_time DESC
LIMIT 5;"

# Compression status
echo "Compression effectiveness:"
./scripts/deploy-ec2.sh timescaledb-maintenance status | grep compression_ratio

Optimized TimescaleDB performance ensures your PRS system can handle growing data volumes while maintaining fast response times!