Asset Manager - Cloud Resource Management
Lead Developer
2024
Automated infrastructure asset tracking and management system

Tech Stack

PythonFastAPITerraformAWSPostgreSQLDocker

Problem

Managing cloud infrastructure assets across multiple environments is challenging:

  • Visibility Gap: Hard to track what’s running where and who owns it
  • Cost Leakage: Unused resources sit idle, burning money
  • Compliance Risk: No central view of security configurations
  • Manual Tracking: Spreadsheets get outdated within hours

Solution

Asset Manager is an automated infrastructure asset tracking system that:

  • Discovers cloud resources across multiple accounts/regions
  • Tracks ownership, cost, and configuration
  • Identifies unused or misconfigured assets
  • Provides APIs for programmatic access
  • Integrates with Terraform for IaC workflows

Architecture

┌─────────────────┐
│  Asset Manager  │
│    (FastAPI)    │
└────────┬────────┘

    ┌────┴────┐
    │         │
┌───▼───┐ ┌──▼────┐
│  AWS  │ │  GCP  │  ... (Cloud Providers)
│  API  │ │  API  │
└───────┘ └───────┘
    │         │
    └────┬────┘

    ┌────▼────┐
    │   DB    │  (Asset Inventory)
    └─────────┘

Key Components:

  1. Discovery Engine: Scans cloud accounts for resources
  2. Asset Store: PostgreSQL database for inventory
  3. Cost Tracker: Aggregates spend by resource/team
  4. API Layer: REST endpoints for querying/updating assets
  5. Terraform Integration: Sync IaC state with actual resources

Technical Implementation

Resource Discovery

class AssetDiscovery:
    def discover_aws_assets(self, account_id: str, regions: list):
        assets = []
        
        for region in regions:
            client = boto3.client('resourcegroupstaggingapi', region_name=region)
            
            # Get all tagged resources
            paginator = client.get_paginator('get_resources')
            for page in paginator.paginate():
                for resource in page['ResourceTagMappingList']:
                    asset = {
                        'arn': resource['ResourceARN'],
                        'type': self.parse_resource_type(resource['ResourceARN']),
                        'region': region,
                        'account_id': account_id,
                        'tags': {tag['Key']: tag['Value'] for tag in resource['Tags']},
                        'discovered_at': datetime.utcnow(),
                    }
                    assets.append(asset)
        
        return assets
    
    def enrich_with_cost_data(self, assets: list):
        """Fetch cost data from AWS Cost Explorer"""
        ce_client = boto3.client('ce')
        
        for asset in assets:
            cost_data = ce_client.get_cost_and_usage(
                TimePeriod={'Start': '2024-10-01', 'End': '2024-11-01'},
                Granularity='MONTHLY',
                Filter={'Tags': {'Key': 'ResourceId', 'Values': [asset['id']]}},
                Metrics=['UnblendedCost']
            )
            asset['monthly_cost'] = parse_cost(cost_data)
        
        return assets

Asset API Endpoints

@app.get("/assets")
async def list_assets(
    type: Optional[str] = None,
    owner: Optional[str] = None,
    region: Optional[str] = None,
    unused: Optional[bool] = None
):
    """List assets with optional filters"""
    query = db.query(Asset)
    
    if type:
        query = query.filter(Asset.type == type)
    if owner:
        query = query.filter(Asset.tags['Owner'] == owner)
    if region:
        query = query.filter(Asset.region == region)
    if unused:
        query = query.filter(Asset.last_accessed_at < datetime.now() - timedelta(days=30))
    
    return query.all()

@app.post("/assets/{asset_id}/retire")
async def retire_asset(asset_id: str, reason: str):
    """Mark asset for retirement (generates Terraform destroy plan)"""
    asset = db.query(Asset).get(asset_id)
    
    # Create retirement plan
    plan = terraform.plan_destroy(resources=[asset.terraform_address])
    
    # Update asset status
    asset.status = "pending_retirement"
    asset.retirement_reason = reason
    db.commit()
    
    return {"plan": plan, "asset": asset}

Terraform Integration

class TerraformSync:
    def sync_with_state(self, state_file: str):
        """Compare Terraform state with discovered assets"""
        tf_state = json.loads(read_file(state_file))
        tf_resources = {r['id']: r for r in tf_state['resources']}
        
        discovered = db.query(Asset).all()
        discovered_ids = {asset.cloud_id for asset in discovered}
        
        # Find drift: resources in TF but not discovered
        missing = set(tf_resources.keys()) - discovered_ids
        
        # Find drift: discovered but not in TF (manual changes)
        unmanaged = discovered_ids - set(tf_resources.keys())
        
        return {
            'managed': len(tf_resources),
            'discovered': len(discovered),
            'missing': list(missing),
            'unmanaged': list(unmanaged),
        }

Features

1. Multi-Cloud Discovery

  • AWS: EC2, RDS, S3, Lambda, ECS, etc.
  • GCP: Compute, Storage, Cloud Functions (planned)
  • Azure: VMs, Databases (planned)

2. Cost Tracking

  • Monthly cost per resource
  • Aggregate by team/project/environment
  • Identify cost anomalies
  • Predict upcoming spend

3. Ownership Tracking

  • Tag-based ownership mapping
  • Team assignment and alerts
  • Slack notifications for high-cost resources

4. Unused Resource Detection

  • Last accessed time tracking
  • Idle instance identification
  • Automated recommendations for cleanup

5. Compliance Reporting

  • Security group audits
  • Untagged resource reports
  • Public access detection
  • Encryption status

Use Cases

Cost Optimization

Scenario: Find all EC2 instances unused in the last 30 days

curl "https://api.example.com/assets?type=ec2&unused=true"

Result: Identified 15 idle instances costing $2,400/month → scheduled for termination

Security Audit

Scenario: List all publicly accessible S3 buckets

assets = api.list_assets(
    type="s3",
    filter=lambda a: a.config.get("public_access") == True
)

Result: Found 3 public buckets, alerted owners, restricted access

Terraform Drift Detection

Scenario: Detect manual changes not tracked in Terraform

asset-manager sync --state-file terraform.tfstate

Result: Found 8 manually created resources → imported into Terraform or deleted

Results & Impact

Cost Savings:

  • 💰 $15K/month saved by identifying and removing unused resources
  • 📊 40% visibility improvement across cloud infrastructure
  • ⏱️ 80% reduction in time spent tracking assets manually

Operational Excellence:

  • ✅ Complete inventory of all cloud resources
  • 🔍 Real-time drift detection between code and reality
  • 🛡️ Automated compliance reporting
  • 📈 Cost trending and forecasting

Team Productivity:

  • Before: 4 hours/week manually tracking spreadsheets
  • After: 15 minutes/week reviewing automated reports
  • Developer Satisfaction: Significantly improved

Technical Stack

Backend:

  • FastAPI for REST APIs
  • SQLAlchemy ORM + PostgreSQL
  • Celery for async discovery jobs
  • Redis for task queue

Cloud SDKs:

  • Boto3 (AWS SDK for Python)
  • Google Cloud Python Client
  • Azure SDK (planned)

Infrastructure:

  • Docker for containerization
  • Kubernetes for orchestration
  • Terraform for IaC
  • GitHub Actions for CI/CD

Monitoring:

  • Prometheus metrics
  • Grafana dashboards
  • PagerDuty for alerts

Challenges & Solutions

Challenge 1: Scale

Problem: Scanning 1000+ resources across 20 regions takes hours

Solution:

  • Implemented parallel scanning with rate limiting
  • Cached results with incremental updates
  • Added region-based batching

Challenge 2: Cost Attribution

Problem: AWS Cost Explorer data is delayed by 24 hours

Solution:

  • Use resource tags for instant attribution
  • Fallback to last known costs
  • Daily sync for historical accuracy

Challenge 3: Terraform State

Problem: Multiple state files across teams

Solution:

  • Terraform Cloud integration for remote state
  • State aggregation service
  • Per-team state file discovery

Key Learnings

  1. Tag Everything: Tagging is critical for ownership and cost tracking
  2. Automate Discovery: Manual tracking doesn’t scale
  3. Real-time Alerts: Catch issues before they become expensive
  4. API-First: Programmatic access > manual dashboards
  5. Cost Allocation: Show teams their spend → drives accountability

Future Roadmap

  • Multi-cloud support (GCP, Azure)
  • ML-based cost forecasting
  • Automated remediation (auto-stop idle instances)
  • Integration with CMDB systems
  • Resource lifecycle policies
  • Slack bot for queries (“Show me my team’s resources”)

Repository: github.com/vayux/asset-manager

Tech Stack: Python · FastAPI · PostgreSQL · AWS · Terraform · Docker · Kubernetes

VayuX Technologies: Infrastructure automation and observability tools