Asset Manager - Cloud Resource Management

Problem

Managing cloud infrastructure assets across multiple environments is challenging:

Visibility Gap: Hard to track what’s running where and who owns it
Cost Leakage: Unused resources sit idle, burning money
Compliance Risk: No central view of security configurations
Manual Tracking: Spreadsheets get outdated within hours

Solution

Asset Manager is an automated infrastructure asset tracking system that:

Discovers cloud resources across multiple accounts/regions
Tracks ownership, cost, and configuration
Identifies unused or misconfigured assets
Provides APIs for programmatic access
Integrates with Terraform for IaC workflows

Architecture

┌─────────────────┐
│  Asset Manager  │
│    (FastAPI)    │
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
┌───▼───┐ ┌──▼────┐
│  AWS  │ │  GCP  │  ... (Cloud Providers)
│  API  │ │  API  │
└───────┘ └───────┘
    │         │
    └────┬────┘
         │
    ┌────▼────┐
    │   DB    │  (Asset Inventory)
    └─────────┘

Key Components:

Discovery Engine: Scans cloud accounts for resources
Asset Store: PostgreSQL database for inventory
Cost Tracker: Aggregates spend by resource/team
API Layer: REST endpoints for querying/updating assets
Terraform Integration: Syncs IaC state with actual resources
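
As a reading aid for the snippets that follow, here is a minimal sketch of an asset record as a plain dataclass. The field names are collected from the code in this document; the types, the defaults, and the `owner` property are assumptions. The class name `AssetRecord` is hypothetical and is not the project's SQLAlchemy model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class AssetRecord:
    """Illustrative stand-in for one inventory row (not the project's actual model)."""

    cloud_id: str                       # provider-side identifier, e.g. an instance ID or ARN
    type: str                           # e.g. "ec2", "s3"
    region: str
    account_id: str
    tags: dict = field(default_factory=dict)
    status: str = "active"              # assumed default; the document only shows "pending_retirement"
    monthly_cost: Optional[float] = None
    last_accessed_at: Optional[datetime] = None
    terraform_address: Optional[str] = None  # None for resources not managed by Terraform

    @property
    def owner(self) -> Optional[str]:
        # Ownership is tag-based; "Owner" is the tag key used by the /assets filter.
        return self.tags.get("Owner")
```

Keeping `terraform_address` optional reflects that discovered resources may exist outside Terraform, which is the case drift detection is meant to surface.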

Technical Implementation

Resource Discovery

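from datetime import datetime, timezone

import boto3


class AssetDiscovery:
    def discover_aws_assets(self, account_id: str, regions: list[str]) -> list[dict]:
        assets = []

        for region in regions:
            client = boto3.client('resourcegroupstaggingapi', region_name=region)

            # The tagging API only returns resources that have (or once had) tags;
            # never-tagged resources need a separate discovery path.
            paginator = client.get_paginator('get_resources')
            for page in paginator.paginate():
                for resource in page['ResourceTagMappingList']:
                    asset = {
                        'arn': resource['ResourceARN'],
                        # parse_resource_type is a helper on this class (not shown here)
                        'type': self.parse_resource_type(resource['ResourceARN']),
                        'region': region,
                        'account_id': account_id,
                        'tags': {tag['Key']: tag['Value'] for tag in resource.get('Tags', [])},
                        'discovered_at': datetime.now(timezone.utc),
                    }
                    assets.append(asset)

        return assets

    def enrich_with_cost_data(self, assets: list[dict],
                              start: str = '2024-10-01', end: str = '2024-11-01') -> list[dict]:
        """Fetch cost data from AWS Cost Explorer for [start, end), dates as YYYY-MM-DD"""
        ce_client = boto3.client('ce')

        for asset in assets:
            # Assumes each resource carries a 'ResourceId' cost allocation tag;
            # assets without it cannot be matched and get no cost.
            resource_id = asset['tags'].get('ResourceId')
            if resource_id is None:
                asset['monthly_cost'] = None
                continue

            cost_data = ce_client.get_cost_and_usage(
                TimePeriod={'Start': start, 'End': end},
                Granularity='MONTHLY',
                Filter={'Tags': {'Key': 'ResourceId', 'Values': [resource_id]}},
                Metrics=['UnblendedCost'],
            )
            # Amounts come back as strings, one entry per period
            asset['monthly_cost'] = sum(
                float(period['Total']['UnblendedCost']['Amount'])
                for period in cost_data['ResultsByTime']
            )

        return assets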
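
The discovery code calls `parse_resource_type`, which is not shown. One way such a helper could work is to split the ARN into its documented fields (`arn:partition:service:region:account-id:resource`) and read the service and, where present, the resource type. The function below is a hypothetical sketch under that assumption, not the project's implementation; the name `parse_arn_type` and the tuple return value are illustrative.

```python
import re
from typing import Optional, Tuple


def parse_arn_type(arn: str) -> Tuple[str, Optional[str]]:
    """Return (service, resource_type) for an ARN; resource_type may be None.

    Hypothetical helper: the document's parse_resource_type is not shown.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")

    service, resource = parts[2], parts[5]
    # The resource field is "id", "type/id", or "type:id" depending on the service,
    # so split on whichever separator comes first.
    pieces = re.split(r"[:/]", resource, maxsplit=1)
    resource_type = pieces[0] if len(pieces) == 2 else None
    return service, resource_type
```

The API examples later in this document filter on values such as `type=ec2`, which suggests the stored type is the service name; the second tuple element is extra detail that a real helper may or may not keep.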

Asset API Endpoints

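from datetime import datetime, timedelta, timezone
from typing import Optional

from fastapi import FastAPI, HTTPException, Query

# db (SQLAlchemy session), Asset (ORM model), and terraform (plan wrapper)
# are defined elsewhere in the project.
app = FastAPI()


# Plain "def" handlers: the SQLAlchemy calls are blocking, and FastAPI runs
# non-async handlers in a thread pool instead of on the event loop.
@app.get("/assets")
def list_assets(
    asset_type: Optional[str] = Query(None, alias="type"),
    owner: Optional[str] = None,
    region: Optional[str] = None,
    unused: Optional[bool] = None,
):
    """List assets with optional filters"""
    query = db.query(Asset)

    if asset_type:
        query = query.filter(Asset.type == asset_type)
    if owner:
        # Assumes tags is a PostgreSQL JSONB column; .astext compares the value as text
        query = query.filter(Asset.tags['Owner'].astext == owner)
    if region:
        query = query.filter(Asset.region == region)
    if unused:
        # Assumes timestamps are stored in UTC, matching discovered_at
        cutoff = datetime.now(timezone.utc) - timedelta(days=30)
        query = query.filter(Asset.last_accessed_at < cutoff)

    return query.all()


@app.post("/assets/{asset_id}/retire")
def retire_asset(asset_id: str, reason: str):
    """Mark asset for retirement (generates Terraform destroy plan)"""
    asset = db.get(Asset, asset_id)  # Session.get (SQLAlchemy 1.4+) replaces Query.get
    if asset is None:
        raise HTTPException(status_code=404, detail="Asset not found")

    # Create retirement plan
    plan = terraform.plan_destroy(resources=[asset.terraform_address])

    # Update asset status
    asset.status = "pending_retirement"
    asset.retirement_reason = reason
    db.commit()
    db.refresh(asset)  # reload attributes expired by the commit before serializing

    return {"plan": plan, "asset": asset}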

Terraform Integration

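import json


class TerraformSync:
    def sync_with_state(self, state_file: str):
        """Compare Terraform state with discovered assets"""
        with open(state_file) as f:
            tf_state = json.load(f)

        # State format v4 nests IDs under resources[].instances[].attributes
        tf_resources = {}
        for resource in tf_state.get('resources', []):
            if resource.get('mode') != 'managed':
                continue  # data sources are not managed infrastructure
            for instance in resource.get('instances', []):
                resource_id = instance.get('attributes', {}).get('id')
                if resource_id:
                    tf_resources[resource_id] = resource
        tf_ids = set(tf_resources)

        discovered = db.query(Asset).all()
        discovered_ids = {asset.cloud_id for asset in discovered}

        # Find drift: resources in TF but not discovered
        missing = tf_ids - discovered_ids

        # Find drift: discovered but not in TF (manual changes)
        unmanaged = discovered_ids - tf_ids

        return {
            'managed': len(tf_resources),
            'discovered': len(discovered),
            'missing': list(missing),
            'unmanaged': list(unmanaged),
        }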
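
One practical wrinkle in this comparison is that the discovery code records ARNs, while the `id` attribute in Terraform state is often a short ID such as `i-0abc`. Many AWS provider resources also expose an `arn` attribute, so a comparison can match on either. The sketch below works through a small in-memory example under that assumption. The function names and the simplified address format (`type.name`, ignoring modules and `for_each` keys) are illustrative, not part of the project.

```python
def managed_identifiers(tf_state: dict) -> dict:
    """Map a simplified Terraform address to the identifiers found in its attributes."""
    result = {}
    for resource in tf_state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # skip data sources
        address = f'{resource["type"]}.{resource["name"]}'
        instances = resource.get("instances", [])
        for index, instance in enumerate(instances):
            attrs = instance.get("attributes") or {}
            # Collect both the short ID and the ARN when they are present.
            idents = {attrs[key] for key in ("id", "arn") if attrs.get(key)}
            key = address if len(instances) == 1 else f"{address}[{index}]"
            result[key] = idents
    return result


def find_drift(tf_state: dict, discovered: set) -> dict:
    """Compare state identifiers with discovered identifiers (IDs or ARNs)."""
    matched = set()
    missing = []
    for address, idents in managed_identifiers(tf_state).items():
        hits = idents & discovered
        if hits:
            matched |= hits
        else:
            missing.append(address)  # in Terraform, but not seen by discovery
    return {"missing": sorted(missing), "unmanaged": sorted(discovered - matched)}


# Sample data: one instance that discovery found, one bucket it did not,
# and one Lambda function that exists only in the cloud account.
state = {
    "version": 4,
    "resources": [
        {"mode": "managed", "type": "aws_instance", "name": "web",
         "instances": [{"attributes": {
             "id": "i-0abc",
             "arn": "arn:aws:ec2:us-east-1:111122223333:instance/i-0abc"}}]},
        {"mode": "managed", "type": "aws_s3_bucket", "name": "logs",
         "instances": [{"attributes": {
             "id": "logs-bucket", "arn": "arn:aws:s3:::logs-bucket"}}]},
        {"mode": "data", "type": "aws_ami", "name": "ubuntu",
         "instances": [{"attributes": {"id": "ami-123"}}]},
    ],
}
discovered = {
    "arn:aws:ec2:us-east-1:111122223333:instance/i-0abc",
    "arn:aws:lambda:us-east-1:111122223333:function:adhoc",
}
report = find_drift(state, discovered)
```

Here `report["missing"]` contains `aws_s3_bucket.logs` and `report["unmanaged"]` contains the Lambda ARN; the data source is ignored.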

Features

1. Multi-Cloud Discovery

AWS: EC2, RDS, S3, Lambda, ECS, etc.
GCP: Compute, Storage, Cloud Functions (planned)
Azure: VMs, Databases (planned)

2. Cost Tracking

Monthly cost per resource
Aggregate by team/project/environment
Identify cost anomalies
Predict upcoming spend
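
A rough sketch of how the aggregation and anomaly checks above could look, using plain dictionaries shaped like the discovery output. The `Team` tag key, the `untagged` bucket, and the 1.5x threshold are assumptions chosen for illustration; the project's actual rules are not described in this document.

```python
from collections import defaultdict
from statistics import mean


def aggregate_costs(assets, tag_key="Team"):
    """Sum monthly_cost per tag value; assets without the tag land in 'untagged'."""
    totals = defaultdict(float)
    for asset in assets:
        group = asset.get("tags", {}).get(tag_key, "untagged")
        totals[group] += asset.get("monthly_cost") or 0.0
    return dict(totals)


def is_cost_anomaly(history, current, factor=1.5):
    """Flag spend that exceeds `factor` times the mean of previous months."""
    if not history:
        return False  # nothing to compare against yet
    return current > factor * mean(history)


assets = [
    {"tags": {"Team": "platform"}, "monthly_cost": 10.0},
    {"tags": {"Team": "platform"}, "monthly_cost": 5.5},
    {"tags": {}, "monthly_cost": 2.0},
    {"tags": {"Team": "data"}},  # cost not known yet
]
totals = aggregate_costs(assets)
```

Passing `tag_key="Project"` or `tag_key="Environment"` gives the other groupings, provided those tags are applied consistently.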

3. Ownership Tracking

Tag-based ownership mapping
Team assignment and alerts
Slack notifications for high-cost resources

4. Unused Resource Detection

Last accessed time tracking
Idle instance identification
Automated recommendations for cleanup
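
The idle check can be sketched as a cutoff comparison on the last-access timestamp. The 30-day window matches the API filter shown earlier; treating assets with no recorded timestamp as a separate "unknown" group is an assumption made here so that missing data is not mistaken for idleness.

```python
from datetime import datetime, timedelta, timezone


def split_by_activity(assets, now=None, max_idle_days=30):
    """Return (unused, unknown) lists of assets.

    unused:  last_accessed_at is older than the cutoff
    unknown: no access timestamp recorded, so idleness cannot be judged
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_idle_days)
    unused, unknown = [], []
    for asset in assets:
        last_accessed = asset.get("last_accessed_at")
        if last_accessed is None:
            unknown.append(asset)
        elif last_accessed < cutoff:
            unused.append(asset)
    return unused, unknown
```

Passing `now` explicitly keeps the function deterministic in tests and in scheduled jobs that evaluate a fixed reporting date.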

5. Compliance Reporting

Security group audits
Untagged resource reports
Public access detection
Encryption status
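
As one example of these reports, a tag compliance check can be sketched as a comparison against a list of required tag keys. The required keys below are an assumption (only `Owner` appears elsewhere in this document), and the input is the dictionary shape produced by the discovery code.

```python
REQUIRED_TAGS = ("Owner", "Environment")  # assumed policy, for illustration only


def missing_tags(asset, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or empty on an asset."""
    tags = asset.get("tags") or {}
    return [key for key in required if not tags.get(key)]


def tag_compliance_report(assets, required=REQUIRED_TAGS):
    """Map each non-compliant asset ARN to the tag keys it is missing."""
    report = {}
    for asset in assets:
        missing = missing_tags(asset, required)
        if missing:
            report[asset["arn"]] = missing
    return report
```

Because the Resource Groups Tagging API lists only resources that have or previously had tags, a report like this would need inventory from another source to catch resources that were never tagged at all.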

Use Cases

Cost Optimization

Scenario: Find all EC2 instances unused in the last 30 days

curl "https://api.example.com/assets?type=ec2&unused=true"

Result: Identified 15 idle instances costing $2,400/month → scheduled for termination
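
The same query can be built from Python. This is a small sketch using only the standard library; `assets_url` is a hypothetical helper, and it only constructs the URL so it can be handed to whichever HTTP client is in use.

```python
from urllib.parse import urlencode


def assets_url(base_url, **filters):
    """Build an /assets URL, dropping unset filters and lowercasing booleans."""
    params = {}
    for name, value in filters.items():
        if value is None:
            continue
        # FastAPI accepts "true"/"false" for bool query parameters.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    query = urlencode(params)
    return f"{base_url}/assets" + (f"?{query}" if query else "")


url = assets_url("https://api.example.com", type="ec2", unused=True)
```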

Security Audit

Scenario: List all publicly accessible S3 buckets

# api is a Python client for the REST API (not shown); filter runs client-side
assets = api.list_assets(
    type="s3",
    filter=lambda a: a.config.get("public_access") is True
)

Result: Found 3 public buckets, alerted owners, restricted access

Terraform Drift Detection

Scenario: Detect manual changes not tracked in Terraform

asset-manager sync --state-file terraform.tfstate

Result: Found 8 manually created resources → imported into Terraform or deleted

Results & Impact

Cost and Efficiency:

💰 $15K/month saved by identifying and removing unused resources
📊 40% visibility improvement across cloud infrastructure
⏱️ 80% reduction in time spent tracking assets manually

Operational Excellence:

✅ Complete inventory of all cloud resources
🔍 Real-time drift detection between code and reality
🛡️ Automated compliance reporting
📈 Cost trending and forecasting

Team Productivity:

Before: 4 hours/week manually tracking spreadsheets
After: 15 minutes/week reviewing automated reports
Developer Satisfaction: Significantly improved

Technical Stack

Backend:

FastAPI for REST APIs
SQLAlchemy ORM + PostgreSQL
Celery for async discovery jobs
Redis as the Celery task queue broker

Cloud SDKs:

Boto3 (AWS SDK for Python)
Google Cloud Python Client (planned)
Azure SDK (planned)

Infrastructure:

Docker for containerization
Kubernetes for orchestration
Terraform for IaC
GitHub Actions for CI/CD

Monitoring:

Prometheus metrics
Grafana dashboards
PagerDuty for alerts

Challenges & Solutions

Challenge 1: Scale

Problem: Scanning 1000+ resources across 20 regions takes hours

Solution:

Implemented parallel scanning with rate limiting
Cached results with incremental updates
Added region-based batching
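
A minimal sketch of parallel scanning with rate limiting, using a thread pool and a shared limiter that spaces out the start of each region scan. The worker count, the rate, and the `scan_region` callable are assumptions; in the real system the callable would wrap the per-region discovery call, and the limiter might sit closer to the individual API requests.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Space calls at least 1/rate seconds apart across all threads."""

    def __init__(self, rate: float):
        self._interval = 1.0 / rate
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self) -> None:
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def scan_regions(regions, scan_region, max_workers=8, rate=5.0):
    """Run scan_region(region) for every region in parallel; return {region: result}."""
    limiter = RateLimiter(rate)

    def task(region):
        limiter.wait()
        return region, scan_region(region)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map re-raises the first exception from any worker on iteration.
        return dict(pool.map(task, regions))
```

Threads fit here because the work is dominated by waiting on network calls; the limiter bounds how quickly new scans start, which helps stay under provider API throttling limits.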

Challenge 2: Cost Attribution

Problem: AWS Cost Explorer data can lag by up to 24 hours

Solution:

Use resource tags for instant attribution
Fall back to last known costs
Daily sync for historical accuracy
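
The fallback step can be sketched as a lookup that prefers fresh Cost Explorer figures and otherwise returns the last stored value, along with a label saying which one was used. The function name and the in-memory dictionaries are illustrative assumptions; the real system would read and write the asset store.

```python
def resolve_monthly_cost(asset_id, fresh_costs, last_known):
    """Return (cost, source), where source is 'fresh', 'last_known', or 'unknown'."""
    if asset_id in fresh_costs:
        # Remember the fresh value so it can serve as the fallback next time.
        last_known[asset_id] = fresh_costs[asset_id]
        return fresh_costs[asset_id], "fresh"
    if asset_id in last_known:
        return last_known[asset_id], "last_known"
    return None, "unknown"
```

Returning the source alongside the amount lets reports mark which figures are stale until the daily sync catches up.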

Challenge 3: Terraform State

Problem: Multiple state files across teams

Solution:

Terraform Cloud integration for remote state
State aggregation service
Per-team state file discovery

Key Learnings

Tag Everything: Tagging is critical for ownership and cost tracking
Automate Discovery: Manual tracking doesn’t scale
Real-time Alerts: Catch issues before they become expensive
API-First: Programmatic access > manual dashboards
Cost Allocation: Show teams their spend → drives accountability

Future Roadmap

Multi-cloud support (GCP, Azure)
ML-based cost forecasting
Automated remediation (auto-stop idle instances)
Integration with CMDB systems
Resource lifecycle policies
Slack bot for queries (“Show me my team’s resources”)

Repository: github.com/vayux/asset-manager

Tech Stack: Python · FastAPI · PostgreSQL · AWS · Terraform · Docker · Kubernetes

VayuX Technologies: Infrastructure automation and observability tools