
Make 24-hour account cooldown a hard requirement for accurate per-lease billing #70

@chrisns


Summary

When two sequential leases use the same AWS account within a 24-hour window, cost attribution between leases may be inaccurate. The current soft 24-hour cooldown should become a hard requirement when accurate per-lease billing is needed.

Problem

Current Behavior

The system prefers accounts not used in the last 24 hours but will fall back to recently-used accounts if no preferred accounts are available:

// source/lambdas/api/innovation-sandbox/src/innovation-sandbox.ts:898-948
if (!lastCleanupTime || parseDatetime(lastCleanupTime) <= twentyFourHoursAgo) {
  preferredAccounts.push(account);
} else {
  fallbackAccounts.push(account);  // Still usable
}
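
The preferred/fallback split above can be sketched as a pure function, which also makes the proposed hard requirement easy to see. This is a minimal sketch and not the project's code: `CandidateAccount`, `partitionByCooldown`, `pickAccount`, and the `enforce` flag are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape for illustration; not the project's actual type.
interface CandidateAccount {
  awsAccountId: string;
  lastCleanupTime?: string; // ISO 8601 timestamp, absent if never cleaned up
}

const TWENTY_FOUR_HOURS_MS = 24 * 60 * 60 * 1000;

// Split candidates the same way the snippet above does.
function partitionByCooldown(accounts: CandidateAccount[], now: Date) {
  const cutoff = now.getTime() - TWENTY_FOUR_HOURS_MS;
  const preferred: CandidateAccount[] = [];
  const fallback: CandidateAccount[] = [];
  for (const account of accounts) {
    const cleanedAt = account.lastCleanupTime
      ? Date.parse(account.lastCleanupTime)
      : undefined;
    if (cleanedAt === undefined || cleanedAt <= cutoff) {
      preferred.push(account);
    } else {
      fallback.push(account);
    }
  }
  return { preferred, fallback };
}

// enforce = false: fall back to a recently used account (current behavior).
// enforce = true: return undefined instead (the hard requirement proposed here).
function pickAccount(
  accounts: CandidateAccount[],
  now: Date,
  enforce: boolean,
): CandidateAccount | undefined {
  const { preferred, fallback } = partitionByCooldown(accounts, now);
  if (preferred.length > 0) return preferred[0];
  return enforce ? undefined : fallback[0];
}
```

With `enforce` set to `false`, the sketch mirrors the current fallback path.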

A warning is logged but the lease proceeds:

"The account acquired for the lease has been used within the last 24 hours and may result in inaccurate cost data"

Why This Causes Inaccurate Billing

  1. Cost Explorer data delay: AWS Cost Explorer data is delayed 8-24 hours. When Lease A terminates, its final totalCostAccrued snapshot may miss costs that haven't appeared in Cost Explorer yet.

  2. No delayed reconciliation: Once a lease terminates, monitoring stops. There's no follow-up to capture costs that appear later.

  3. Gap period attribution: Costs incurred between Lease A's termination and Lease B's start may not be attributed to either lease.

Example Scenario

Timeline:
09:00 - Lease A starts on Account-123
17:00 - Lease A terminates (final cost snapshot: $50)
17:30 - Lease B starts on Account-123
Next day - Cost Explorer shows $20 from Lease A's final hours

Result:
- Lease A shows $50 (missing $20)
- Lease B shows costs from 17:30 onward
- $20 is lost/unattributed
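
The arithmetic behind this scenario can be reproduced with a small sketch. Everything here is illustrative: `CostRecord`, `costVisibleAt`, the dates, and the visibility times are assumptions chosen to match the timeline above, not data from the system.

```typescript
// Hypothetical cost record: when the cost was incurred, and when it
// became visible in Cost Explorer.
interface CostRecord {
  incurredAt: string;
  visibleAt: string;
  amountUsd: number;
}

// Sum costs incurred inside the lease window that were visible at `asOf`.
function costVisibleAt(
  records: CostRecord[],
  leaseStart: string,
  leaseEnd: string,
  asOf: string,
): number {
  const start = Date.parse(leaseStart);
  const end = Date.parse(leaseEnd);
  const cutoff = Date.parse(asOf);
  return records
    .filter((r) => {
      const incurred = Date.parse(r.incurredAt);
      return incurred >= start && incurred < end && Date.parse(r.visibleAt) <= cutoff;
    })
    .reduce((sum, r) => sum + r.amountUsd, 0);
}

// Lease A ran 09:00-17:00; its last $20 only shows up the next day.
const leaseARecords: CostRecord[] = [
  { incurredAt: "2024-01-01T09:00:00Z", visibleAt: "2024-01-01T17:00:00Z", amountUsd: 50 },
  { incurredAt: "2024-01-01T16:00:00Z", visibleAt: "2024-01-02T08:00:00Z", amountUsd: 20 },
];
const leaseStart = "2024-01-01T09:00:00Z";
const leaseEnd = "2024-01-01T17:00:00Z";

const atTermination = costVisibleAt(leaseARecords, leaseStart, leaseEnd, leaseEnd); // 50
const oneDayLater = costVisibleAt(leaseARecords, leaseStart, leaseEnd, "2024-01-02T17:00:00Z"); // 70
const missingFromLeaseA = oneDayLater - atTermination; // 20
```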

Proposed Solution

Introduce a new "Cooldown" account status (similar to Quarantine) that separates recently-used accounts from the available pool.

How It Works

  1. After lease termination: Account moves to Cooldown status instead of directly to Available
  2. Automatic release: A scheduled process moves accounts from Cooldown to Available after 24 hours
  3. Admin override: Admins can manually move accounts from Cooldown to Available early (accepting billing inaccuracy)

Account State Flow

Active (lease in progress)
    │
    ▼
CleanUp (cleanup running)
    │
    ▼
Cooldown (new state - 24hr wait for Cost Explorer)
    │
    ├─► [24 hours elapsed] ─► Available
    │
    └─► [Admin manual release] ─► Available (with warning logged)
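
The flow above could be encoded as a transition table plus a release check. This is a sketch under assumptions: only the statuses shown in the diagram are modeled (the real status enum has others, such as Quarantine), the `Available` to `Active` edge is assumed, and `allowedTransitions`, `canTransition`, and `releaseReason` are hypothetical names.

```typescript
// Only the statuses from the diagram; the real enum contains more.
type AccountStatus = "Available" | "Active" | "CleanUp" | "Cooldown";

const allowedTransitions: Record<AccountStatus, AccountStatus[]> = {
  Available: ["Active"], // assumed: a lease is granted
  Active: ["CleanUp"],
  // CleanUp may still go straight to Available when cooldown is not enforced.
  CleanUp: ["Cooldown", "Available"],
  Cooldown: ["Available"],
};

function canTransition(from: AccountStatus, to: AccountStatus): boolean {
  return allowedTransitions[from].includes(to);
}

const COOLDOWN_MS = 24 * 60 * 60 * 1000;

// Decide whether an account may leave Cooldown, and why.
// Returns undefined when the account must keep waiting.
function releaseReason(
  enteredCooldownAt: Date,
  now: Date,
  adminOverride: boolean,
): "elapsed" | "admin-override" | undefined {
  if (now.getTime() - enteredCooldownAt.getTime() >= COOLDOWN_MS) return "elapsed";
  // An early release is allowed, but the caller should log a warning.
  return adminOverride ? "admin-override" : undefined;
}
```

Returning a reason rather than a boolean lets the caller log the warning only for the admin override path.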

Global Configuration

{
  // ... existing config
  enforceAccountCooldown: boolean  // Default: false for backward compatibility
}

When disabled (default): Current behavior - accounts go directly to Available after cleanup.
When enabled: Accounts go to Cooldown status and wait 24 hours before becoming Available.
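
A minimal sketch of how cleanup might pick the next status from this setting. `GlobalConfigSketch` and `statusAfterCleanup` are hypothetical names; treating a missing field as `false` is an assumption made to match the stated default.

```typescript
// Hypothetical slice of the global config; only the proposed field is shown.
interface GlobalConfigSketch {
  enforceAccountCooldown?: boolean;
}

// A missing or false setting keeps the existing behavior.
function statusAfterCleanup(config: GlobalConfigSketch): "Cooldown" | "Available" {
  return config.enforceAccountCooldown === true ? "Cooldown" : "Available";
}
```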

Why Global Only

A per-template option wouldn't work: if Template A enforces cooldown but Template B doesn't, Template B could use an account and pollute the billing window for Template A anyway. The cooldown operates at the account level, so enforcement must be global.

Files to Modify

| File | Change |
| --- | --- |
| source/common/data/sandbox-account/sandbox-account.ts | Add Cooldown to account status enum |
| source/common/data/innovation-sandbox-config/innovation-sandbox-config.ts | Add enforceAccountCooldown field |
| source/lambdas/api/innovation-sandbox/src/innovation-sandbox.ts | Filter out Cooldown accounts in acquireAvailableAccount(), add admin release endpoint |
| source/lambdas/account-management/account-cleanup/ | Transition to Cooldown instead of Available when config enabled |
| New Lambda: cooldown-release-handler.ts | Scheduled process to move Cooldown → Available after 24 hours |
| source/infrastructure/lib/ | Add EventBridge schedule for cooldown release |
| Frontend | Add Cooldown status display, admin release button, config toggle |
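
The core of the scheduled release process could look roughly like the sketch below. It only covers the selection step: the data store query, the status update, and the EventBridge wiring are left out. `SandboxAccountRecord`, its `cooldownStartedAt` field, and `selectAccountsToRelease` are hypothetical and would need to match the real account schema.

```typescript
// Hypothetical record shape; cooldownStartedAt is an assumed new field.
interface SandboxAccountRecord {
  awsAccountId: string;
  status: string;
  cooldownStartedAt?: string; // ISO 8601 timestamp
}

const COOLDOWN_PERIOD_MS = 24 * 60 * 60 * 1000;

// Return the IDs of accounts whose cooldown has fully elapsed at `now`.
// Accounts without a timestamp are skipped and left for an admin to release.
function selectAccountsToRelease(
  accounts: SandboxAccountRecord[],
  now: Date,
): string[] {
  return accounts
    .filter((account) => {
      if (account.status !== "Cooldown") return false;
      if (account.cooldownStartedAt === undefined) return false;
      const startedAt = Date.parse(account.cooldownStartedAt);
      return now.getTime() - startedAt >= COOLDOWN_PERIOD_MS;
    })
    .map((account) => account.awsAccountId);
}
```

Keeping the selection pure makes the 24-hour boundary easy to unit test without a clock or a database.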

Acceptance Criteria

  • Cooldown account status added
  • Global config option enforceAccountCooldown added
  • Cleanup transitions accounts to Cooldown when enabled
  • Scheduled Lambda releases accounts after 24 hours
  • Admin API endpoint to manually release accounts early
  • Frontend shows Cooldown accounts and release action
  • Documentation updated
  • Existing behavior unchanged when setting is disabled (default)


Labels: enhancement (New feature or request)
