CV

For a downloadable version, get in touch.

T. Michael Cornelia

Data center infrastructure leader at OpenAI on the Stargate program, owning partnership and delivery from construction through compute readiness for large-scale AI infrastructure. 13 years at Meta scaling the fleet from megawatts to gigawatts; deep operational ownership of GPU training and inference platforms including Zion, Grand Teton, SMC, MTIA, and NVIDIA GB200/GB300 GenAI clusters. Founded and led Meta’s data center AI Operations program; drove a 49% reduction in unplanned downtime on training infrastructure and 70% on inference in year one. Engineer → Director: engineer’s instincts paired with director-level strategic range. Known for building durable, high-performing teams: people follow my leadership across roles, locations, and reorgs. Promoted multiple FTEs into Site Manager and Director-track roles.

In Practice

Daily user of Codex; built and maintain two primary custom agents: EDI (always-on, Linux-based Chief-of-Staff) and Glyph (OpenClaw messenger agent, accessed via mobile), which manages and directs a team of other specialized agents
Built a daily operational telemetry layer on the OpenAI API: scheduled pulls from production data sources, agents that flag SLA/KPI misses (red/yellow/green), and real-time SEV alerts piped to my workflow
Use Codex daily for executive workflows: morning briefs, weekly metric reviews, project knowledge graphs, work-product synthesis, and team-level rollups

Selected Impact

Regional growth: Scaled Meta’s South region 754 MW → 1,070 MW (+42%) in 18 months
Reliability: Held South region unavailability below targets on training and inference platforms through a 6x scope increase (2025)
AI Operations program: Led a 49% reduction in training downtime/interruptions, 70% reduction in inference downtime, 180% improvement in diagnostic accuracy (2022)
People: Top-quartile manager engagement scores (2025); promoted 3 FTEs to Site Manager (2024–25)
Compliance: Owned SOC 2, ISO 27001, SOX, and PCI-DSS across a 14-site region; zero audit findings during tenure

Experience

Member of Infrastructure Staff

OpenAI · July 2026 – Present

Serve as OpenAI’s primary delivery interface with strategic infrastructure partners for the Stargate program, OpenAI’s initiative to build the world’s most advanced AI infrastructure ecosystem through next-generation data center campuses. Own partnership and delivery from construction through compute readiness, ensuring large-scale AI compute infrastructure is operationally ready.

Director, Global Operations, South Region

Meta Platforms, Inc. · 2025 – July 2026

Directed all site operations for Meta’s South region: ~24% of the global GPU fleet and of total megawatts. Scaled the region 754 MW → 1,070 MW in 18 months (+42%), including a 500+ MW generative AI training cluster; on path to ~2,500 MW by 2027. Supported the turn-up of Hyperion in Louisiana, Meta’s next-generation campus, with a footprint the size of Manhattan and 5 GW of capacity at full build-out.

AI & GPU Infrastructure Operations

Operated GPU training and inference platforms across the South region; established the AI operations knowledge base, dashboards, and weekly leadership cadence adopted as the source of truth across the data center org
Sponsored enterprise-wide quality metrics overhaul: built a multi-tier composite scoring framework, retired legacy metrics, and introduced LLM-based quality assessment of repair tickets, since adopted globally
Launched cloud operations pilot across multiple public cloud providers (OCI, GCP), defining the support model for heterogeneous cloud infrastructure

New Region Turn-Up & Commissioning

Delivered Meta’s first all-Turin server region on schedule, proving an accelerated capacity program targeting an 80% reduction in fulfillment time
Positioned 2,063 racks at rates up to 750 racks/week, with a single-day record of 200 racks
Managed continuous turn-up pipeline across liquid-cooled facilities, rapid deployment structures, and leased data centers; 10+ concurrent new builds delivering 150–330 MW per quarter

Organizational Strategy & Workforce Planning

Authored and globally deployed the Leadership Deployment Model: 9-month effort with HR and Legal enabling site operations to support the planned 12+ GW fleet without adding regional leadership headcount
Led convergence of facilities and site operations risk frameworks into a unified Data Center Operations metrics dashboard, the first shared operational data layer between the two orgs
Co-authored ring-aligned restructuring proposal that shifted operations from per-site to infrastructure-ring-based model; 34% management reduction while scaling to gigawatt-class campuses

Team & Talent

Built and retained a team of senior operations leaders with multi-year tenure under my leadership; multiple managers and FTEs followed me across roles and locations as the South region grew
Promoted 3 FTEs into Site Manager roles (2024–25); developed pipeline of next-generation operations leadership
Sustained top-quartile manager engagement scores (2025) through a period of significant org change and rapid growth
Recognized internally for talent magnetism — recruiters and adjacent orgs routinely asked to “borrow” my model

Capacity Planning, Risk & Compliance

Created region-level capacity delivery risk framework combining construction risk with operational signals into a unified executive dashboard
Standardized capacity engineering processes, launched root cause corrective action (RCCA) framework, and established quarterly quality assessments
Managed region against rack turn-up SLO (P90 < 5 days), redeployment SLO (P95 < 4 days), and decommission targets with per-site tracking
Accountable for SOC 2, ISO 27001, SOX (ICFR), and PCI-DSS compliance across all 14 sites; directed the operations staff executing control implementation, evidence collection, and audit readiness, with zero findings throughout my tenure
Supported annual external financial audits (Ernst & Young), including fixed-asset / PP&E existence verification under SOX 404, with zero reportable findings

Government Relations & Community

Represented Meta in state-level legislative advocacy for data center tax incentive preservation
Graduate of Leadership North Carolina; maintain statewide network of leaders across public and private sectors
Managed relationships with government officials, economic development authorities, and community organizations across a growing footprint of eight states

Director, Site Operations, Stanton Springs, GA

Meta Platforms, Inc. · 2022 – 2025

Directed site operations for Meta’s Stanton Springs (Newton County, GA) data center campus, one of the fastest-growing campuses in the fleet.

Founded Meta’s data center AI Operations program (2022); year one: −49% training interruptions, −70% inference downtime, +180% diagnostic accuracy; established the dashboards, knowledge base, and weekly leadership cadence now used across the data center org
Delivered the largest single-region capacity increase globally in 2022: 7,005 racks landed and provisioned (32% more than the next closest region), including the fleet’s largest GPU footprint, while maintaining 99.56% server availability
Co-created the new-hardware introduction process (PVT to MP) originally developed for Grand Teton (OCP H100); still in use today across deployments of AMD Instinct, MTIA, GB200, and GB300
Co-authored unplanned-downtime alerting strategy; led authoring of the SEV0 incident response plan for Site Managers, institutionalizing operations continuity across data center sites
Built operational excellence framework (1:1 templates, meeting cadences, analytics SOPs) adopted across all regions globally
Led insourcing business case analyzing contingent vs. FTE economics across 500+ positions, including staffing models for gigawatt-scale sites through 2030
Built government and community relationships across Newton County and the state of Georgia

Data Center Operations Manager, Stanton Springs, GA

Meta Platforms, Inc. · 2018 – 2022

Managed infrastructure operations teams at Meta’s Stanton Springs campus during a period of rapid expansion and new building commissioning.

Owned all capacity-related projects across the campus as it went from dirt to provisioning, including new building commissioning and expansion phases
Directed the turn-up of the largest A100 GPU cluster known to NVIDIA at the time, followed by the largest H100 cluster
Managed cross-functional coordination across construction, production operations, and headquarters teams for turn-up, turn-down, and retrofit execution
Developed operational processes and standards subsequently adopted at other sites
Stood up and executed the site’s compliance controls (SOC 2, ISO 27001, SOX, PCI-DSS) as the programs came into scope, owning evidence collection and audit readiness; zero findings
Mentored individual contributors across multiple technical teams

Data Center Operations Manager, Forest City, NC

Meta Platforms, Inc. (formerly Facebook) · 2013 – 2018

Led infrastructure teams at Meta’s North Carolina campus during rapid fleet expansion.

Owned all capacity-related projects across the campus, including commissioning of the site’s third data center building
Managed cross-functional coordination across construction, production operations, and headquarters teams for turn-up, turn-down, and retrofit execution
Developed capacity processes adopted fleet-wide; mentored individual contributors across multiple teams
Built local government and community engagement relationships at city and county levels

Earlier Career

Systems Engineer & Architect, SUM/IT Systems · 2004 – 2013
Designed and deployed systems solutions (Unix, Linux, VMware, Solaris, Windows Server) for SMB customers. Co-created the company’s cloud offering and managed full customer lifecycle from scoping through long-term support.

VP, Operations & Information Systems, The School Box, Inc. · 2000 – 2013
Directed technology operations for a multi-state retailer (~400 employees). Managed all infrastructure, partnered with CEO/CFO on strategy and budgets, and built the organization’s primary technical strategy and knowledge management platform.

Core Competencies

Data Center Operations · Hyperscale Infrastructure · GPU Fleet Operations · AI/ML Training Clusters · New Site Commissioning · Liquid Cooling · Capacity Planning & Delivery · SLO Management · Agentic AI for Operations · Multi-Cloud (OCI, AWS, GCP) · Colocation Management · Workforce Strategy · Organizational Design · Quality Systems · Operational Excellence · Compliance & Audit Readiness (SOC 2, ISO 27001, SOX, PCI-DSS) · Legislative Advocacy · Government Affairs · Community Development

Education & Certifications

Leadership North Carolina, statewide leadership program (graduate)

University of Georgia, Business Management coursework

Red Hat Certified Engineer (RHCE)

Red Hat Certified System Administrator (RHCSA)