Comprehensive DNS Guide

A complete guide to DNS concepts, troubleshooting, and best practices.

DNS (Domain Name System) is one of the most fundamental technologies in modern internet infrastructure. As a DevOps or SRE engineer, understanding DNS is crucial for troubleshooting, performance optimization, and system reliability. This comprehensive guide covers all DNS concepts you need to know for real-world scenarios and technical implementations.

Table of Contents

  1. DNS Fundamentals
  2. DNS Record Types
  3. DNS Resolution Process
  4. DNS Caching Mechanisms
  5. DNS Security
  6. Performance Optimization
  7. Troubleshooting DNS Issues
  8. Cloud DNS Services
  9. Monitoring and Alerting
  10. Common DNS Scenarios and Solutions

DNS Fundamentals

What is DNS?

DNS is a hierarchical, distributed naming system that translates human-readable domain names (like hanyouqing.com) into IP addresses (like 104.21.25.158) that computers use to identify each other on the network.

Key Components

  • Root Servers: 13 named root server identities (A through M) that know where the top-level domain servers are
  • TLD Servers: Top-level domain servers ( .com, .org, .net, etc.)
  • Authoritative Name Servers: Servers that contain the actual DNS records for domains
  • Recursive Resolvers: DNS servers that perform queries on behalf of clients
  • Stub Resolvers: Client-side DNS resolvers (usually in the OS)

Root Servers

A root server is a fundamental component of the Domain Name System (DNS) that acts as the entry point to the internet’s hierarchy of domain names. These servers contain a global directory of Top-Level Domains (TLDs), such as .com or .org, and direct queries for specific domain names to the appropriate TLD name servers, which then lead to the IP address needed to access a website or online service. There are 13 logical root server addresses worldwide, but hundreds of physical root servers are distributed globally to provide redundancy and stability to the internet.

TLD Servers

Top-level domain (TLD) servers hold the delegation (NS) records for every domain registered under a specific TLD, such as “.com” or “.org”. They do not store the final IP address of a website. Instead, after a root server refers a resolver to the right TLD server, the TLD server refers the resolver onward to the authoritative name servers for the requested domain, which supply the actual records.

Authoritative Name Servers

Authoritative name servers provide definitive answers for specific DNS domains, holding the original, authoritative data for a zone and responding to queries about hostnames and IP addresses within that domain, unlike recursive servers which cache information. There are two types of authoritative servers: a primary server, which stores the original zone records, and one or more secondary servers, which receive updates from the primary server to create an identical copy for redundancy and load balancing.

Recursive Resolvers

A recursive resolver (or DNS recursor) is a server that acts as a middleman in a DNS query, taking a domain name from a client and performing the entire lookup process on the client’s behalf. It queries root, Top-Level Domain (TLD), and authoritative nameservers to find the corresponding IP address and then returns it to the client. Recursive resolvers use caching to store frequently requested information, allowing for faster responses to subsequent queries for the same domain.

Stub Resolvers

A stub resolver is a basic Domain Name System (DNS) client on a user’s device that forwards DNS queries to a more capable recursive resolver for resolution, instead of handling the full process itself. It functions as an intermediary, sending a request from an application (like a web browser) to a recursive server, which then performs the necessary lookups to find the corresponding IP address. Stub resolvers are “stubs” or partial resolvers because they rely on other servers to do the heavy lifting of finding DNS records.

DNS Hierarchy

Root (.)
├── TLD (.com, .org, .net)
│   └── Domain (hanyouqing.com)
│       ├── Subdomain (www.hanyouqing.com)
│       ├── Subdomain (api.hanyouqing.com)
│       └── Subdomain (mail.hanyouqing.com)

DNS Record Types

Essential Record Types

A Record

  • Purpose: Maps domain name to IPv4 address
  • Example: hanyouqing.com. IN A 104.21.25.158
  • TTL: Typically 300-3600 seconds

AAAA Record

  • Purpose: Maps domain name to IPv6 address
  • Example: hanyouqing.com. IN AAAA 2001:db8::1
  • TTL: Typically 300-3600 seconds

CNAME Record

  • Purpose: Creates alias for another domain name
  • Example: blog.hanyouqing.com. IN CNAME hanyouqing.com.
  • Restrictions: Cannot be used at the zone apex (root domain) and cannot coexist with other record types at the same name
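
The coexistence restriction can be checked mechanically before a zone is published. The sketch below is illustrative only: find_cname_conflicts is a hypothetical helper, and it assumes the zone has already been reduced to simple "name type" pairs, one per line. It ignores DNSSEC records, which are allowed next to a CNAME.

```shell
#!/bin/bash
# find_cname_conflicts is a hypothetical helper, not part of any DNS tool.
# Input (stdin): one "name type" pair per line.
# Output: every name that has a CNAME together with another record type.
find_cname_conflicts() {
    awk '
        $2 == "CNAME" { has_cname[$1] = 1; next }
                      { has_other[$1] = 1 }
        END {
            for (name in has_cname)
                if (name in has_other) print name
        }
    ' | sort
}

# blog has both a CNAME and a TXT record, so it is reported.
printf '%s\n' \
    'blog.hanyouqing.com. CNAME' \
    'blog.hanyouqing.com. TXT' \
    'hanyouqing.com. A' | find_cname_conflicts
```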

MX Record

  • Purpose: Specifies mail exchange servers
  • Example: hanyouqing.com. IN MX 10 mail.hanyouqing.com.
  • Priority: Lower numbers = higher priority
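
To illustrate the priority rule, here is a small sketch that orders MX targets the way a sending mail server would try them. order_mx_hosts and the backup host name are hypothetical; the input is assumed to be "preference host" pairs, one per line.

```shell
#!/bin/bash
# order_mx_hosts is a hypothetical helper.
# Input (stdin): "preference host" lines. Output: hosts, most preferred first.
order_mx_hosts() {
    # Numeric sort on the preference column, then keep only the host name.
    sort -n -k1,1 | awk '{ print $2 }'
}

printf '%s\n' \
    '20 backup.hanyouqing.com.' \
    '10 mail.hanyouqing.com.' | order_mx_hosts
```

The host with preference 10 is listed before the one with preference 20, matching "lower numbers = higher priority".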

NS Record

  • Purpose: Lists the authoritative name servers for a zone; also used to delegate a subdomain to other name servers
  • Example: subdomain.hanyouqing.com. IN NS ns1.subdomain.hanyouqing.com.

TXT Record

  • Purpose: Stores text information (SPF, DKIM, DMARC, etc.)
  • Example: hanyouqing.com. IN TXT "v=spf1 include:_spf.google.com ~all"

PTR Record

  • Purpose: Reverse DNS lookup (IP to domain)
  • Example: 1.2.0.192.in-addr.arpa. IN PTR hanyouqing.com.
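
The PTR owner name is the IPv4 address with its octets reversed, followed by in-addr.arpa. A minimal sketch of that transformation follows; ipv4_to_ptr_name is a hypothetical helper and does no input validation.

```shell
#!/bin/bash
# ipv4_to_ptr_name is a hypothetical helper, not part of dig or host.
ipv4_to_ptr_name() {
    local old_ifs="$IFS"
    # Split the dotted quad into the positional parameters $1..$4.
    IFS=.
    set -- $1
    IFS="$old_ifs"
    # Reverse the octets and append the in-addr.arpa. suffix.
    printf '%s.%s.%s.%s.in-addr.arpa.\n' "$4" "$3" "$2" "$1"
}

ipv4_to_ptr_name 192.0.2.1
```

For 192.0.2.1 this prints 1.2.0.192.in-addr.arpa., the owner name used in the example above.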

SOA Record

  • Purpose: Start of Authority - contains administrative information
  • Components: Primary name server, admin email, serial number, refresh, retry, and expire intervals, and the negative-caching (minimum) TTL

Advanced Record Types

SRV Record

  • Purpose: Specifies location of services
  • Example: _http._tcp.hanyouqing.com. IN SRV 10 5 80 hanyouqing.com.
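
The four values after SRV are, in order, priority, weight, port, and target. The sketch below labels them; describe_srv is a hypothetical helper that expects only the record data as a single string.

```shell
#!/bin/bash
# describe_srv is a hypothetical helper. $1 is the SRV record data:
# "priority weight port target".
describe_srv() {
    # Split the record data on whitespace into $1..$4.
    set -- $1
    printf 'target=%s port=%s priority=%s weight=%s\n' "$4" "$3" "$1" "$2"
}

describe_srv "10 5 80 hanyouqing.com."
```

As with MX records, a lower priority value is tried first. The weight only matters among targets that share the same priority.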

CAA Record

  • Purpose: Certificate Authority Authorization
  • Example: hanyouqing.com. IN CAA 0 issue "letsencrypt.org"

DNS Resolution Process

Recursive vs Iterative Queries

Recursive Query

  • Client asks resolver to find the answer
  • Resolver performs all necessary queries
  • Returns final answer to client

Iterative Query

  • Resolver asks a root, TLD, or authoritative server
  • Server returns the best answer it can provide, often a referral to another server
  • Resolver follows referrals until it reaches an authoritative answer

Complete Resolution Flow

[Diagrams not rendered: a flowchart and a sequence diagram of the resolution workflow.]

Resolution Steps

  1. Client Query: Application requests domain resolution
  2. Local Cache Check: Stub resolver checks local cache
  3. Recursive Resolver: If not cached, query recursive resolver
  4. Root Server Query: Recursive resolver queries root servers
  5. TLD Query: Query appropriate TLD server
  6. Authoritative Query: Query domain’s authoritative name servers
  7. Response: Return IP address to client
  8. Caching: Store result in cache for future use
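
Steps 4 to 6 walk the hierarchy from the root down. As a rough illustration, the sketch below lists the chain of names a resolver works through for a given query, without sending any DNS traffic. delegation_chain is a hypothetical helper; in practice a delegation exists only at some of these names, and dig +trace shows the real path.

```shell
#!/bin/bash
# delegation_chain is a hypothetical helper. It prints the root, then each
# parent domain, then the queried name itself, one per line.
delegation_chain() {
    # Strip a trailing dot so "a.b." and "a.b" behave the same.
    local name="${1%.}"
    local chain="$name."
    # Repeatedly drop the leftmost label: api.example.com -> example.com -> com
    while [ "${name#*.}" != "$name" ]; do
        name="${name#*.}"
        chain="$name. $chain"
    done
    printf '. %s\n' "$chain" | tr ' ' '\n'
}

delegation_chain api.hanyouqing.com
```

For api.hanyouqing.com the output runs from "." through "com." and "hanyouqing.com." to the full name.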

DNS Caching Mechanisms

TTL (Time To Live)

TTL determines how long DNS records can be cached:

  • Short TTL (300-600s): Fast changes, high availability
  • Medium TTL (3600s): Balanced approach
  • Long TTL (86400s+): Stable records, reduced load
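
A cache applies the TTL by counting down from the moment it stored the record. The following sketch shows that arithmetic; remaining_ttl is a hypothetical helper, and all three arguments are assumed to be in seconds.

```shell
#!/bin/bash
# remaining_ttl is a hypothetical helper.
#   $1 = time the record was cached (epoch seconds)
#   $2 = TTL from the DNS answer (seconds)
#   $3 = current time (epoch seconds)
# Prints the seconds left before the entry expires, or 0 if it already has.
remaining_ttl() {
    local left=$(( $1 + $2 - $3 ))
    [ "$left" -gt 0 ] || left=0
    echo "$left"
}

remaining_ttl 1000 300 1120   # cached 120 s ago with a 300 s TTL
remaining_ttl 1000 300 1400   # cached 400 s ago, so the entry has expired
```

This is also why lowering a TTL shortly before a planned change helps: caches that stored the record under the old, longer TTL keep it until that countdown reaches zero.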

Caching Layers

1. Application Cache

  • Browser Cache: Chrome (about 60 seconds), Firefox (60 seconds by default)
  • Application Cache: Custom TTL, varies by implementation

2. Operating System Cache

  • Windows: DNS Client service
  • Linux: systemd-resolved, nscd
  • macOS: mDNSResponder

3. Recursive Resolver Cache

  • ISP Resolvers: Cache each record for its TTL, though some clamp or extend TTLs, occasionally to 24-48 hours
  • Public Resolvers: Google (8.8.8.8), Cloudflare (1.1.1.1)

4. Authoritative Server (Source of Truth, Not a Cache)

  • Zone File: Static records served from the zone's source data
  • Dynamic Updates: Real-time changes

Cache Poisoning Prevention

  • Random Transaction IDs: Prevent prediction attacks
  • Source Port Randomization: Additional entropy
  • DNSSEC: Cryptographic validation

DNS Security

Common DNS Attacks

1. DNS Spoofing/Cache Poisoning

  • Attack: Injecting false DNS records
  • Prevention: DNSSEC, random transaction IDs

2. DNS Amplification

  • Attack: Using DNS servers to amplify DDoS attacks
  • Prevention: Rate limiting, response size limits

3. DNS Tunneling

  • Attack: Exfiltrating data through DNS queries
  • Prevention: DNS filtering, monitoring
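
One simple monitoring heuristic for tunneling is to flag query names with unusually long labels, since exfiltrated data is often encoded into subdomains. The sketch below is only an illustration: is_suspicious_qname is a hypothetical helper, and the default thresholds (40 characters per label, 100 in total) are assumptions rather than recommended values.

```shell
#!/bin/bash
# is_suspicious_qname is a hypothetical helper.
#   $1 = query name, $2 = max label length (default 40, an assumption),
#   $3 = max total length (default 100, an assumption)
# Returns 0 (true) when the name exceeds either limit.
is_suspicious_qname() {
    local name="${1%.}" max_label="${2:-40}" max_total="${3:-100}"
    [ "${#name}" -gt "$max_total" ] && return 0
    local old_ifs="$IFS" label
    # Split the name into labels on the dots and check each one.
    IFS=.
    for label in $name; do
        if [ "${#label}" -gt "$max_label" ]; then
            IFS="$old_ifs"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

if is_suspicious_qname "www.hanyouqing.com"; then echo "flag"; else echo "ok"; fi
```

Length alone produces false positives (some CDNs use long names), so a check like this is better used to pick candidates for review than to block traffic.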

4. Domain Hijacking

  • Attack: Unauthorized changes to DNS records
  • Prevention: Registrar locks, multi-factor authentication

DNSSEC (DNS Security Extensions)

DNSSEC provides:

  • Data Integrity: Ensures records haven’t been modified
  • Authentication: Verifies record authenticity
  • Authenticated Denial of Existence: Proves that a queried record does not exist

DNSSEC Record Types

  • RRSIG: Digital signature for record sets
  • DNSKEY: Public key for verification
  • DS: Delegation signer record
  • NSEC/NSEC3: Proof of non-existence
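
A validating resolver signals a successful DNSSEC check by setting the AD (Authenticated Data) flag in its response, which dig shows on the ";; flags:" header line. The sketch below looks for that flag in dig output read from standard input. has_ad_flag is a hypothetical helper, and the sample line is illustrative, not captured from a real query.

```shell
#!/bin/bash
# has_ad_flag is a hypothetical helper. It reads dig output on stdin and
# succeeds when the header's flag list contains "ad".
# Typical use (needs network access): dig +dnssec hanyouqing.com | has_ad_flag
has_ad_flag() {
    awk '/^;; flags:/ {
             # Keep only the flag list between ";; flags:" and the next ";".
             sub(/^;; flags:/, ""); sub(/;.*/, "")
             for (i = 1; i <= NF; i++) if ($i == "ad") found = 1
         }
         END { exit !found }'
}

# Illustrative header line in dig's format.
sample=';; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1'
if printf '%s\n' "$sample" | has_ad_flag; then echo "validated"; else echo "not validated"; fi
```

The flag appears only when the zone is signed and the resolver being queried performs validation, so its absence does not by itself indicate an attack.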

DNS over HTTPS (DoH) and DNS over TLS (DoT)

DoH (DNS over HTTPS)

  • Port: 443 (HTTPS)
  • Encryption: TLS (1.2 or 1.3)
  • Privacy: Hides DNS queries from the ISP and other on-path observers; the chosen resolver still sees them

DoT (DNS over TLS)

  • Port: 853
  • Encryption: TLS (1.2 or 1.3)
  • Performance: Lower overhead than DoH

Performance Optimization

DNS Performance Metrics

Query Response Time

  • Excellent: < 50ms
  • Good: 50-100ms
  • Acceptable: 100-200ms
  • Poor: > 200ms
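
These bands are easy to apply in a script. In the sketch below, classify_latency is a hypothetical helper that takes a response time in whole milliseconds; because the listed ranges overlap at their edges, it assumes that exactly 100 ms counts as good and exactly 200 ms as acceptable.

```shell
#!/bin/bash
# classify_latency is a hypothetical helper. $1 = response time in whole ms.
classify_latency() {
    local ms="$1"
    if [ "$ms" -lt 50 ]; then
        echo "excellent"
    elif [ "$ms" -le 100 ]; then
        echo "good"
    elif [ "$ms" -le 200 ]; then
        echo "acceptable"
    else
        echo "poor"
    fi
}

classify_latency 35
classify_latency 180
```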

Cache Hit Ratio

  • Target: > 90%
  • Measurement: Cached queries / Total queries
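
The ratio is a plain division, shown here as a percentage. cache_hit_ratio is a hypothetical helper; the two counters would come from your resolver's statistics, whose names vary by software.

```shell
#!/bin/bash
# cache_hit_ratio is a hypothetical helper.
#   $1 = queries answered from cache, $2 = total queries
# Prints the hit ratio as a percentage with one decimal place.
cache_hit_ratio() {
    local hits="$1" total="$2"
    # Avoid dividing by zero when no queries have been seen yet.
    if [ "$total" -eq 0 ]; then
        echo "0.0"
        return
    fi
    awk -v h="$hits" -v t="$total" 'BEGIN { printf "%.1f\n", 100 * h / t }'
}

cache_hit_ratio 9200 10000
```

With 9200 cached answers out of 10000 queries the result is 92.0, just above the 90% target.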

Optimization Strategies

1. TTL Optimization

# Short TTL for critical services
app.hanyouqing.com.    300  IN  A  104.21.25.158

# Long TTL for stable services
hanyouqing.com.    3600 IN  A  104.21.25.158

2. DNS Load Balancing

# Round-robin A records
hanyouqing.com.    300  IN  A  104.21.25.158
hanyouqing.com.    300  IN  A  104.21.25.159
hanyouqing.com.    300  IN  A  104.21.25.160

3. Geographic DNS

# US East Coast
us-east.hanyouqing.com. 300 IN  A  104.21.25.158

# US West Coast  
us-west.hanyouqing.com. 300 IN  A  104.21.25.159

# Europe
eu.hanyouqing.com.      300 IN  A  104.21.25.160

4. CDN Integration

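# A CNAME cannot sit at the zone apex, so these examples alias the www subdomain.
# Cloudflare
www.hanyouqing.com.    300 IN  CNAME  hanyouqing.com.cdn.cloudflare.net.

# AWS CloudFront (an alternative to the record above, not an addition to it)
www.hanyouqing.com.    300 IN  CNAME  d1234567890.cloudfront.net.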

Monitoring DNS Performance

Key Metrics

  • Query Latency: Response time per query
  • Success Rate: Percentage of successful queries
  • Cache Hit Ratio: Cached vs uncached queries
  • Error Rate: Failed queries percentage

Tools

  • dig: Command-line DNS lookup
  • nslookup: Interactive DNS query tool
  • host: Simple DNS lookup utility
  • DNS monitoring services: Pingdom, UptimeRobot

Troubleshooting DNS Issues

Common DNS Problems

1. DNS Resolution Failures

# Check if DNS is working
dig @8.8.8.8 google.com

# Check specific record type
dig A hanyouqing.com
dig MX hanyouqing.com
dig TXT hanyouqing.com

2. Slow DNS Resolution

# Measure query time
dig +stats hanyouqing.com

# Check different resolvers
dig @8.8.8.8 hanyouqing.com
dig @1.1.1.1 hanyouqing.com

3. DNS Propagation Issues

# Check from multiple locations
dig @8.8.8.8 hanyouqing.com
dig @1.1.1.1 hanyouqing.com
dig @208.67.222.222 hanyouqing.com

Troubleshooting Commands

Basic DNS Queries

# Standard query
dig hanyouqing.com

# Specific record type
dig A hanyouqing.com
dig MX hanyouqing.com
dig TXT hanyouqing.com

# Reverse DNS lookup
dig -x 192.0.2.1

# Trace resolution path
dig +trace hanyouqing.com

Advanced Troubleshooting

# Check authoritative servers
dig NS hanyouqing.com

# Check SOA record
dig SOA hanyouqing.com

# Check all records (many servers return only a minimal answer to ANY, per RFC 8482)
dig ANY hanyouqing.com

# Verbose output
dig +trace +all hanyouqing.com

DNS Cache Management

# Flush DNS cache (Linux with systemd-resolved)
sudo resolvectl flush-caches
# Or restart the service, which also clears its cache
sudo systemctl restart systemd-resolved

# Flush DNS cache (macOS)
sudo dscacheutil -flushcache
sudo killall -HUP mDNSResponder

# Flush DNS cache (Windows)
ipconfig /flushdns

Debugging Tools

1. dig (Domain Information Groper)

# Basic usage (replace "server" with a resolver address such as 8.8.8.8)
dig @server hanyouqing.com

# Query specific record type
dig A hanyouqing.com

# Trace resolution
dig +trace hanyouqing.com

# Show statistics
dig +stats hanyouqing.com

2. nslookup

# Interactive mode
nslookup
> set type=MX
> hanyouqing.com

# Command line mode
nslookup hanyouqing.com 8.8.8.8

3. host

# Simple lookup
host hanyouqing.com

# Reverse lookup
host 192.0.2.1

# Verbose output
host -v hanyouqing.com

Cloud DNS Services

AWS Route 53

Features

  • High Availability: 100% SLA
  • Global Anycast: 200+ edge locations
  • Health Checks: Monitor endpoint health
  • Traffic Policies: Advanced routing

Record Types

  • A/AAAA: IPv4/IPv6 addresses
  • CNAME: Aliases
  • MX: Mail exchange
  • TXT: Text records
  • SRV: Service records
  • PTR: Reverse DNS

Pricing

  • Hosted Zones: $0.50/month
  • Queries: $0.40 per million queries
  • Health Checks: $0.50/month per check

Google Cloud DNS

Features

  • Global Anycast: Fast resolution worldwide
  • Private Zones: Internal DNS resolution
  • DNSSEC: Security extensions
  • Monitoring: Cloud Monitoring integration

Pricing

  • Managed Zones: $0.20/month
  • Queries: $0.40 per million queries
  • Private Zones: $0.20/month

Azure DNS

Features

  • High Performance: Fast resolution
  • Private DNS: Internal resolution
  • Alias Records: Point to Azure resources
  • Traffic Manager: Load balancing

Pricing

  • Hosted Zones: $0.50/month
  • Queries: $0.40 per million queries

Cloudflare DNS

Features

  • Free Tier: Basic DNS service
  • Global Anycast: 200+ cities
  • DNSSEC: Security extensions
  • Analytics: Query analytics

Pricing

  • Free: Basic DNS
  • Pro: $20/month
  • Business: $200/month
  • Enterprise: Custom pricing

Monitoring and Alerting

Key Metrics to Monitor

1. Availability Metrics

  • DNS Resolution Success Rate: > 99.9%
  • Query Response Time: < 100ms
  • Uptime: 99.99%+
  • Server Availability: Individual DNS server status
  • Zone Transfer Success: Secondary DNS synchronization

2. Performance Metrics

  • Average Query Time: Track over time
  • Peak Query Volume: Identify patterns
  • Cache Hit Ratio: > 90%
  • Query Rate: Queries per second (QPS)
  • Response Size: Monitor for large responses
  • Recursive Query Depth: Number of hops to resolve

3. Security Metrics

  • DNSSEC Validation: 100% for signed zones
  • Suspicious Queries: Monitor for attacks
  • Failed Queries: Track error patterns
  • DDoS Attack Detection: Unusual query patterns
  • Malicious Domain Blocking: Security filter effectiveness

4. Business Metrics

  • Geographic Distribution: Query origins by region
  • Top Queried Domains: Most requested domains
  • Query Types Distribution: A, AAAA, MX, etc.
  • Peak Usage Times: Traffic patterns
  • Error Rate by Query Type: Identify problematic record types

5. Infrastructure Metrics

  • CPU Usage: DNS server resource utilization
  • Memory Usage: Cache and process memory
  • Network I/O: Bandwidth utilization
  • Disk I/O: Log and zone file operations
  • Connection Count: Active connections to DNS servers

Monitoring Tools

1. Built-in Monitoring

# Check DNS server status
systemctl status named
systemctl status systemd-resolved

# View DNS logs
journalctl -u named
journalctl -u systemd-resolved

2. External Monitoring

  • Pingdom: DNS monitoring
  • UptimeRobot: Uptime monitoring
  • Datadog: Infrastructure monitoring
  • New Relic: Application monitoring

3. Custom Monitoring

#!/bin/bash
# Simple DNS monitoring script

DOMAIN="hanyouqing.com"
RESOLVER="8.8.8.8"
THRESHOLD=100  # milliseconds

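# The "Query time" line in dig's statistics holds the value in ms as its fourth field
RESPONSE_TIME=$(dig @"$RESOLVER" "$DOMAIN" +stats | awk '/Query time/ {print $4}')

if [ -z "$RESPONSE_TIME" ]; then
    # No "Query time" line means the query itself failed (for example, a timeout)
    echo "ALERT: no DNS response from $RESOLVER for $DOMAIN"
elif [ "$RESPONSE_TIME" -gt "$THRESHOLD" ]; then
    echo "ALERT: DNS response time $RESPONSE_TIME ms exceeds threshold $THRESHOLD ms"
    # Send alert notification
fi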

Alerting Best Practices

1. Alert Thresholds

  • Critical: DNS resolution failures
  • Warning: High response times
  • Info: Configuration changes

2. Alert Channels

  • Email: For critical issues
  • Slack: For team notifications
  • PagerDuty: For on-call escalation
  • SMS: For critical outages

3. Alert Fatigue Prevention

  • Escalation Policies: Tiered response
  • Maintenance Windows: Suppress during maintenance
  • Alert Grouping: Combine related alerts
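
Maintenance-window suppression can be as simple as a time check in front of the notification step. The sketch below is a minimal illustration with hypothetical helper names (in_maintenance_window, notify). It takes all times as epoch seconds and only prints what it would do.

```shell
#!/bin/bash
# Hypothetical helpers. All times are epoch seconds.
#   $1 = now, $2 = window start (inclusive), $3 = window end (exclusive)
in_maintenance_window() {
    [ "$1" -ge "$2" ] && [ "$1" -lt "$3" ]
}

#   $1 = now, $2 = window start, $3 = window end, $4 = alert message
notify() {
    local now="$1" start="$2" end="$3" message="$4"
    if in_maintenance_window "$now" "$start" "$end"; then
        echo "suppressed: $message"
    else
        echo "ALERT: $message"
    fi
}

# Window from t=1000 to t=2000: the first alert is suppressed, the second is sent.
notify 1500 1000 2000 "DNS resolution failure"
notify 2500 1000 2000 "DNS resolution failure"
```

Hosted alerting tools usually offer this as a built-in feature (silences or scheduled downtime), which is preferable to custom logic when available.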

Common DNS Scenarios and Solutions

Understanding DNS Fundamentals

What is DNS and why is it important?

DNS (Domain Name System) is a hierarchical, distributed naming system that translates human-readable domain names into IP addresses. It’s important because:

  • Makes the internet user-friendly
  • Enables load balancing and failover
  • Supports email routing
  • Enables service discovery
  • Provides security through DNSSEC

Recursive vs Iterative DNS Queries

  • Recursive Query: Client asks resolver to find the complete answer. The resolver performs all necessary queries and returns the final result.
  • Iterative Query: Resolver asks a root, TLD, or authoritative server, which returns the best answer it can provide, often a referral. The resolver keeps querying the referred servers until it gets an authoritative answer.

DNS Record Types Overview

Key record types include:

  • A: Maps domain to IPv4 address
  • AAAA: Maps domain to IPv6 address
  • CNAME: Creates alias for another domain
  • MX: Specifies mail exchange servers
  • NS: Delegates subdomain to name servers
  • TXT: Stores text information (SPF, DKIM, etc.)
  • PTR: Reverse DNS lookup
  • SOA: Start of Authority record

Advanced DNS Concepts

How DNS Caching Works

DNS caching works at multiple levels:

  1. Browser Cache: Stores DNS records temporarily
  2. OS Cache: Operating system caches DNS responses
  3. Resolver Cache: ISP or public resolver caches
  4. Authoritative Servers: Not a cache; they serve the source records and set the TTLs that the other layers honor

TTL (Time To Live) determines how long records are cached. When TTL expires, the record is removed from cache and fresh queries are made.

DNS Propagation Process

DNS propagation is the time it takes for DNS changes to spread across all DNS servers worldwide. It takes time because:

  • Each DNS server has its own cache with different TTL values
  • Changes must propagate through the DNS hierarchy
  • Different geographic locations may see changes at different times
  • TTL values determine how quickly changes propagate

DNSSEC Implementation

DNSSEC (DNS Security Extensions) provides:

  • Data Integrity: Ensures DNS records haven’t been modified
  • Authentication: Verifies the authenticity of DNS responses
  • Authenticated Denial of Existence: Proves that a queried record does not exist

Key components:

  • RRSIG: Digital signatures for record sets
  • DNSKEY: Public keys for verification
  • DS: Delegation signer records
  • NSEC/NSEC3: Proof of non-existence

Troubleshooting Common Issues

DNS Resolution Failures

Troubleshooting steps:

  1. Check local DNS: nslookup domain.com
  2. Test different resolvers: dig @8.8.8.8 domain.com
  3. Check DNS propagation: Use multiple DNS servers
  4. Verify record exists: dig A domain.com
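  5. Check TTL and delegation: dig +noall +answer domain.com shows the TTL in the second column; dig +trace domain.com shows the delegation path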
  6. Test from different locations: Use online DNS tools
  7. Check firewall: Ensure port 53 is open for both UDP and TCP
  8. Review DNS logs: Check for errors

Slow DNS Resolution Causes

Common causes:

  • Very low TTL values: Records expire quickly, forcing frequent full lookups
  • Poor resolver performance: Slow DNS servers
  • Network latency: Geographic distance
  • DNS server overload: Too many queries
  • Long resolution chains: CNAME chains and uncached delegations require multiple lookups
  • Cache misses: Frequent uncached queries

Solutions:

  • Optimize TTL values
  • Use faster resolvers (1.1.1.1, 8.8.8.8)
  • Implement DNS caching
  • Use CDN with DNS optimization
  • Monitor and alert on performance

DNS Failover Strategies

  1. Health Checks: Monitor endpoint availability
  2. Multiple A Records: Round-robin with health checks
  3. Geographic DNS: Route based on location
  4. Weighted Records: Distribute traffic by weight
  5. CNAME Records: Point to load balancer
  6. TTL Optimization: Short TTL for quick failover
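
Weighted records (strategy 4) are usually implemented by the DNS provider, but the selection logic is simple enough to sketch. pick_weighted is a hypothetical helper: it takes a number in the range 0 to total weight minus 1 (normally random, fixed here so the result is repeatable) and a list of "target:weight" pairs.

```shell
#!/bin/bash
# pick_weighted is a hypothetical helper.
#   $1   = roll, an integer in [0, sum of all weights)
#   rest = "target:weight" pairs
# Prints the target whose weight range contains the roll.
pick_weighted() {
    local roll="$1" pair weight
    shift
    for pair in "$@"; do
        weight="${pair##*:}"
        if [ "$roll" -lt "$weight" ]; then
            echo "${pair%:*}"
            return 0
        fi
        # Move past this target's range and try the next one.
        roll=$((roll - weight))
    done
    return 1   # roll was outside the total weight
}

# A 70/30 split: rolls 0-69 select the first address, rolls 70-99 the second.
pick_weighted 65 "104.21.25.158:70" "104.21.25.159:30"
pick_weighted 85 "104.21.25.158:70" "104.21.25.159:30"
```

Combined with health checks, an unhealthy target would simply be left out of the list before the pick is made.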

Performance Optimization

DNS Performance Optimization

Optimization strategies:

  1. TTL Optimization: Balance between performance and flexibility
  2. DNS Caching: Implement at multiple levels
  3. Load Balancing: Distribute queries across servers
  4. Geographic DNS: Route to nearest server
  5. CDN Integration: Use CDN for DNS resolution
  6. Monitoring: Track performance metrics
  7. Record Optimization: Minimize query complexity

Key DNS Metrics to Monitor

  • Query Response Time: < 100ms target
  • Success Rate: > 99.9%
  • Cache Hit Ratio: > 90%
  • Query Volume: Track patterns
  • Error Rate: Monitor failures
  • DNSSEC Validation: 100% for signed zones
  • Uptime: 99.99%+

Security Considerations

Common DNS Security Threats

Common threats:

  1. DNS Spoofing: Injecting false records
  2. Cache Poisoning: Corrupting DNS cache
  3. DNS Amplification: DDoS attacks
  4. DNS Tunneling: Data exfiltration
  5. Domain Hijacking: Unauthorized changes
  6. Phishing: Malicious domains

Prevention:

  • Implement DNSSEC
  • Use secure resolvers
  • Monitor for anomalies
  • Regular security audits
  • Access controls
  • Rate limiting

DNS Infrastructure Security

Security measures:

  1. DNSSEC: Cryptographic validation
  2. Access Controls: Restrict DNS changes
  3. Monitoring: Track suspicious activity
  4. Rate Limiting: Prevent abuse
  5. Firewall Rules: Control access
  6. Regular Updates: Keep software current
  7. Backup: Maintain DNS backups
  8. Audit Logs: Track all changes

Cloud and Modern DNS

Cloud DNS vs Traditional DNS

Cloud DNS advantages:

  • Global Anycast: Fast worldwide resolution
  • High Availability: 99.99%+ uptime
  • Scalability: Handle traffic spikes
  • Integration: Works with cloud services
  • Monitoring: Built-in analytics
  • Security: Advanced security features
  • API: Programmatic management

DNS over HTTPS (DoH) and DNS over TLS (DoT)

  • DoH: DNS queries over HTTPS (port 443)
  • DoT: DNS queries over TLS (port 853)

Benefits:

  • Privacy: Encrypts DNS queries
  • Security: Prevents eavesdropping
  • Censorship Resistance: Harder to block
  • Performance: Comparable to traditional DNS once a connection is established, though the initial handshake adds latency

Considerations:

  • Centralization: Fewer resolver options
  • Monitoring: Harder to monitor encrypted traffic
  • Compatibility: Not all systems support them

Conclusion

DNS is a critical component of modern internet infrastructure that every DevOps and SRE engineer must understand. This comprehensive guide covers everything from basic concepts to advanced troubleshooting, security considerations, and performance optimization.

Key takeaways:

  • DNS is hierarchical and distributed
  • Caching is essential for performance
  • Security requires multiple layers of protection
  • Monitoring and alerting are crucial for reliability
  • Cloud services offer significant advantages
  • Troubleshooting requires systematic approach

Understanding these concepts will make you a more effective engineer when solving real-world DNS issues. Remember to practice with actual DNS queries and familiarize yourself with the tools and commands discussed in this guide.

Home DNS Server Setup: Smart DNS with Geographic Routing

Overview

Building a home DNS server with geographic routing allows you to:

  • Route domestic traffic to local servers for better performance
  • Route international traffic through optimized paths
  • Implement ad blocking and parental controls
  • Monitor and analyze DNS queries
  • Create custom local domains

Architecture

[Architecture diagram not rendered.]

Implementation Steps

1. Hardware Requirements

Minimum Requirements:

  • Raspberry Pi 4 (4GB RAM) or equivalent
  • 32GB+ microSD card
  • Ethernet connection
  • Power supply

Recommended Setup:

  • Intel NUC or mini PC
  • 8GB+ RAM
  • 128GB+ SSD
  • Gigabit Ethernet

2. Software Stack

Core Components:

  • Pi-hole: DNS server with ad blocking
  • AdGuard Home: Alternative DNS server
  • Prometheus: Metrics collection
  • Grafana: Monitoring dashboard
  • Docker: Container runtime, with Docker Compose for multi-container orchestration

3. Installation Script

#!/bin/bash
# Home DNS Server Setup Script

# Update system
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER

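# Install Docker Compose (release assets use lowercase OS names, e.g. docker-compose-linux-x86_64;
# -f makes curl fail on an HTTP error instead of saving the error page)
sudo curl -fL "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose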

# Create project directory
mkdir -p ~/home-dns
cd ~/home-dns

# Create docker-compose.yml
cat > docker-compose.yml << 'EOF'
version: '3.8'

services:
  pihole:
    container_name: pihole
    image: pihole/pihole:latest
    ports:
      - "53:53/tcp"
      - "53:53/udp"
      - "80:80/tcp"
      - "443:443/tcp"
    environment:
      TZ: 'Asia/Shanghai'
      WEBPASSWORD: 'your_secure_password'
      DNS1: '223.5.5.5'
      DNS2: '114.114.114.114'
      DNSSEC: 'true'
      DNSMASQ_LISTENING: 'all'
    volumes:
      - './etc-pihole:/etc/pihole'
      - './etc-dnsmasq.d:/etc/dnsmasq.d'
    restart: unless-stopped
    cap_add:
      - NET_ADMIN

  prometheus:
    container_name: prometheus
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - './prometheus.yml:/etc/prometheus/prometheus.yml'
      - 'prometheus_data:/prometheus'
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=200h'
      - '--web.enable-lifecycle'
    restart: unless-stopped

  grafana:
    container_name: grafana
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      # Replace this default before exposing Grafana on the network
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - 'grafana_data:/var/lib/grafana'
    restart: unless-stopped

  node-exporter:
    container_name: node-exporter
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
EOF

# Create Prometheus configuration
cat > prometheus.yml << 'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

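  # NOTE: Pi-hole's /admin/api.php returns JSON, not the Prometheus text format,
  # so this job does not scrape successfully as written. Point it at a Pi-hole
  # metrics exporter (a separate container, not defined in this compose file).
  - job_name: 'pihole'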
    static_configs:
      - targets: ['pihole:80']
    metrics_path: /admin/api.php
    params:
      stats: ['']
      auth: ['']
EOF

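# Start services (sudo is needed here because the docker group change above
# only takes effect after the next login)
sudo docker-compose up -d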

echo "DNS Server setup complete!"
echo "Pi-hole Admin: http://$(hostname -I | awk '{print $1}'):80"
echo "Grafana Dashboard: http://$(hostname -I | awk '{print $1}'):3000"
echo "Prometheus: http://$(hostname -I | awk '{print $1}'):9090"

4. Geographic DNS Routing Configuration

Pi-hole Custom Configuration:

# Create custom dnsmasq configuration
cat > etc-dnsmasq.d/99-geographic-routing.conf << 'EOF'
# China DNS servers for domestic domains
server=/baidu.com/223.5.5.5
server=/taobao.com/223.5.5.5
server=/qq.com/223.5.5.5
server=/weibo.com/223.5.5.5
server=/jd.com/223.5.5.5
server=/sina.com.cn/223.5.5.5
server=/sohu.com/223.5.5.5
server=/163.com/223.5.5.5
server=/126.com/223.5.5.5
server=/sina.cn/223.5.5.5

# International DNS servers for global domains
server=/google.com/1.1.1.1
server=/youtube.com/1.1.1.1
server=/facebook.com/1.1.1.1
server=/twitter.com/1.1.1.1
server=/instagram.com/1.1.1.1
server=/github.com/1.1.1.1
server=/stackoverflow.com/1.1.1.1

# Custom local domains
# Note: .local is reserved for mDNS (RFC 6762); a suffix such as home.arpa avoids conflicts
address=/home.local/192.168.1.1
address=/nas.local/192.168.1.100
address=/printer.local/192.168.1.50

# Ad blocking lists
addn-hosts=/etc/pihole/gravity.list
addn-hosts=/etc/pihole/black.list
EOF

5. Monitoring Dashboard Setup

Grafana Dashboard Configuration:

{
  "dashboard": {
    "title": "Home DNS Server Monitoring",
    "panels": [
      {
        "title": "DNS Query Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(pihole_dns_queries_total[5m])",
            "legendFormat": "Queries/sec"
          }
        ]
      },
      {
        "title": "DNS Response Time",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.95, rate(pihole_dns_query_duration_seconds_bucket[5m]))",
            "legendFormat": "95th percentile"
          }
        ]
      },
      {
        "title": "Blocked Queries",
        "type": "stat",
        "targets": [
          {
            "expr": "pihole_dns_queries_blocked_total",
            "legendFormat": "Blocked"
          }
        ]
      },
      {
        "title": "Top Queried Domains",
        "type": "table",
        "targets": [
          {
            "expr": "topk(10, pihole_dns_queries_total)",
            "legendFormat": "{{domain}}"
          }
        ]
      }
    ]
  }
}

6. Advanced Features

Smart DNS Routing Script:

#!/bin/bash
# Smart DNS routing based on domain analysis

# Function to determine if domain should use China DNS
is_china_domain() {
    local domain=$1
    local china_domains=(
        "baidu.com" "taobao.com" "qq.com" "weibo.com" "jd.com"
        "sina.com.cn" "sohu.com" "163.com" "126.com" "sina.cn"
        "tmall.com" "alipay.com" "zhihu.com" "douban.com"
    )
    
    for china_domain in "${china_domains[@]}"; do
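        # Match the domain itself or any subdomain; a plain substring match
        # would also accept unrelated names such as "notbaidu.com.example.org"
        if [[ "$domain" == "$china_domain" || "$domain" == *".$china_domain" ]]; then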
            return 0
        fi
    done
    return 1
}

# Function to route DNS query
route_dns_query() {
    local domain=$1
    local query_type=$2
    
    if is_china_domain "$domain"; then
        # Use China DNS servers
        dig @223.5.5.5 "$domain" "$query_type"
    else
        # Use international DNS servers
        dig @1.1.1.1 "$domain" "$query_type"
    fi
}

# Example usage
route_dns_query "baidu.com" "A"
route_dns_query "google.com" "A"

Performance Monitoring Script:

#!/bin/bash
# DNS performance monitoring

# Check DNS response time
check_dns_performance() {
    local resolver=$1
    local domain=$2
    local threshold=100  # milliseconds
    
    local response_time
    response_time=$(dig @"$resolver" "$domain" +stats | awk '/Query time/ {print $4}')
    
    if [ -z "$response_time" ]; then
        echo "WARNING: no DNS response from $resolver for $domain"
    elif [ "$response_time" -gt "$threshold" ]; then
        echo "WARNING: DNS response time $response_time ms exceeds threshold $threshold ms for $resolver"
        # Send alert to Alertmanager. Prometheus itself (port 9090) does not accept
        # pushed alerts. This assumes an Alertmanager instance on localhost:9093,
        # which the compose file above does not define.
        curl -X POST "http://localhost:9093/api/v2/alerts" \
             -H "Content-Type: application/json" \
             -d "[{\"labels\":{\"alertname\":\"DNSSlowResponse\",\"resolver\":\"$resolver\"},\"annotations\":{\"description\":\"DNS response time is $response_time ms\"}}]"
    fi
}

# Monitor multiple resolvers
check_dns_performance "223.5.5.5" "hanyouqing.com"
check_dns_performance "1.1.1.1" "hanyouqing.com"
check_dns_performance "8.8.8.8" "hanyouqing.com"

7. Router Configuration

OpenWrt Router Setup:

# Configure router to forward DNS queries to the home DNS server
# ("server" is a list option, so it is added with add_list)
uci add_list dhcp.@dnsmasq[0].server='192.168.1.100'  # Pi-hole IP
uci set dhcp.@dnsmasq[0].noresolv='1'

# Configure DHCP to advertise the DNS server to clients (DHCP option 6)
uci add_list dhcp.lan.dhcp_option='6,192.168.1.100'
uci commit dhcp
/etc/init.d/dnsmasq restart

8. Security Considerations

Firewall Rules:

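# Allow DNS traffic
ufw allow 53/tcp
ufw allow 53/udp

# Allow web interfaces
ufw allow 80/tcp
ufw allow 443/tcp

# Restrict monitoring to the local network (adjust the subnet to match yours).
# ufw applies the first matching rule, so allowing 3000/tcp for everyone and
# then adding a deny for the same port would leave it open.
ufw allow from 192.168.1.0/24 to any port 3000 proto tcp  # Grafana
ufw allow from 192.168.1.0/24 to any port 9090 proto tcp  # Prometheus

# Note: ports published by Docker bypass ufw unless Docker's iptables handling is adjusted.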

SSL/TLS Configuration:

# Generate SSL certificates for secure access
# (standalone mode binds port 80, so free that port first or use a DNS challenge)
certbot certonly --standalone -d dns.hanyouqing.com

# Configure Nginx reverse proxy
cat > /etc/nginx/sites-available/dns-server << 'EOF'
server {
    listen 443 ssl;
    server_name dns.hanyouqing.com;
    
    ssl_certificate /etc/letsencrypt/live/dns.hanyouqing.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/dns.hanyouqing.com/privkey.pem;
    
    location / {
        proxy_pass http://localhost:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF

Benefits of Home DNS Server

  1. Performance: Faster DNS resolution for frequently accessed domains
  2. Privacy: Fewer queries leave your network thanks to local caching, though cache misses still go to the upstream resolvers
  3. Control: Custom domain resolution and ad blocking
  4. Monitoring: Detailed insights into network usage
  5. Security: Block malicious domains and phishing attempts
  6. Customization: Local domain names and custom routing rules

Maintenance and Updates

Automated Updates:

#!/bin/bash
# Automated DNS server maintenance

# Run from the project directory so docker-compose finds docker-compose.yml
cd ~/home-dns || exit 1

# Update Pi-hole
docker-compose pull pihole
docker-compose up -d pihole

# Update block lists
docker exec pihole pihole -g

# Update system packages
sudo apt update && sudo apt upgrade -y

# Restart services
docker-compose restart

# Backup configuration (sudo because the container may own these files as root)
sudo tar -czf "dns-backup-$(date +%Y%m%d).tar.gz" etc-pihole/ etc-dnsmasq.d/

Youqing Han

DevOps Engineer
