Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce

Key Takeaways
Hook: Most companies rolling out AI agents assume they know which tasks employees want automated. New research covering 844 tasks across 104 occupations reveals a troubling gap: the tasks workers most want AI to handle often aren't the ones AI is best equipped to do, and the skills workers think will remain valuable may not align with what organizations actually need.
Why This Matters Now
Organizations are accelerating AI agent deployments without a clear understanding of worker preferences or technical feasibility. This creates three immediate risks: resistance from employees who feel their work is being automated against their will, wasted investment in AI capabilities that target the wrong tasks, and skill development programs that prepare workers for roles that won't exist.
The shift matters because AI agents are moving beyond simple automation into augmentation territory, acting as copilots rather than replacements. Getting this wrong means organizations risk both productivity losses and talent attrition. Leaders need a framework that maps worker preferences against AI capability, then designs deployment strategies that respect both constraints.
What's Actually New
This research introduces a worker-centric auditing framework that systematically compares what employees want AI agents to do with expert assessments of what AI can actually accomplish. Using the WORKBank dataset, covering 844 distinct tasks across 104 occupations representing the U.S. workforce, researchers documented a fundamental mismatch.
Workers want AI to automate repetitive administrative tasks (data entry, scheduling, routine reporting) and augment complex judgment calls (strategic planning, personnel decisions, creative work). But AI capability doesn't map cleanly to these preferences. Some tasks workers want automated remain difficult for current AI systems due to physical constraints or contextual complexity. Meanwhile, some high-value knowledge work that employees prefer to keep is technically automatable but raises organizational concerns about quality, accountability, and trust.
The study also reveals shifts in which human skills workers believe will remain valuable. As AI handles more analytical and data processing work, employees increasingly value interpersonal skills, contextual judgment, and the ability to work across ambiguous problem spaces, capabilities that don't show up in traditional job descriptions.
Task Preference vs. AI Capability Gap
Key insight: The largest gap appears in administrative tasks. 78% of workers want these automated, but only 45% of the tasks are technically feasible with current AI because of context and physical constraints.
Skills Workers Believe Will Remain Valuable
Key insight: Technical analysis skills dropped from 34% (pre-AI baseline) to 11%, while interpersonal and contextual judgment skills rose by 18 percentage points combined.
AI Deployment Decision Framework
Framework explanation: This three-gate decision process ensures AI deployments respect both worker preferences and technical constraints, with high-stakes tasks defaulting to augmentation rather than full automation.
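The article does not spell out the three gates, so the following is only a minimal sketch under stated assumptions: the gates are taken to be worker preference, technical feasibility, and task stakes, and the `Task` fields, the 0-to-1 scales, and the 0.5 thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTOMATE = "automate"
    AUGMENT = "augment"
    HOLD = "hold"  # keep the task human-led for now


@dataclass(frozen=True)
class Task:
    name: str
    worker_preference: float  # assumed: share of workers who want AI involved, 0..1
    feasibility: float        # assumed: expert-rated technical feasibility, 0..1
    high_stakes: bool         # e.g. hiring or performance reviews


def decide(task: Task, pref_min: float = 0.5, feas_min: float = 0.5) -> Decision:
    """Pass a task through three gates (gate definitions are assumptions)."""
    # Gate 1: do workers want AI involved in this task at all?
    if task.worker_preference < pref_min:
        return Decision.HOLD
    # Gate 2: can current AI plausibly handle the task?
    if task.feasibility < feas_min:
        return Decision.HOLD
    # Gate 3: high-stakes tasks default to augmentation, not full automation.
    if task.high_stakes:
        return Decision.AUGMENT
    return Decision.AUTOMATE
```

For example, `decide(Task("performance review", 0.7, 0.9, True))` returns `Decision.AUGMENT`, because the task clears the first two gates but is flagged as high-stakes. Returning `HOLD` when a gate fails is one possible design choice; a team could just as reasonably route those tasks to a manual review queue.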
Implications for Leaders
- Owner: Chief Operating Officer. Action: Conduct a 4-week task audit across 3-5 departments, surveying employees on which tasks they want AI to automate versus augment, then map results against technical feasibility assessments from your AI/IT team. Metric: Percentage of tasks where worker preference aligns with AI capability. Timeframe: 45 days.
- Owner: Chief Human Resources Officer. Action: Launch a skills mapping initiative that identifies which employee capabilities will remain high-value as AI agents take on more routine work, then redesign training programs accordingly. Metric: Number of employees enrolled in programs focused on judgment, interpersonal, and cross-functional skills. Timeframe: 60 days.
- Owner: Chief Technology Officer. Action: Establish a governance process where AI deployment decisions require both technical feasibility assessment and worker preference input before moving to pilot. Metric: Percentage of AI initiatives that include formal worker input in the design phase. Timeframe: 30 days.
- Owner: Department Heads. Action: Run small-scale pilots (10-15 employees) testing AI augmentation for high-judgment tasks rather than full automation of routine work, measuring both productivity and satisfaction. Metric: Employee satisfaction score and task completion time for augmented versus non-augmented work. Timeframe: 45-60 days.
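The alignment metric named for the task audit can be computed in a few lines. This is a rough sketch, not a method from the underlying research: it assumes each audited task carries a worker-preference share and a feasibility score on a 0-to-1 scale, and it treats a task as "aligned" when both values fall on the same side of a hypothetical 0.5 threshold.

```python
from typing import Iterable, Tuple


def alignment_rate(audit: Iterable[Tuple[float, float]], threshold: float = 0.5) -> float:
    """Return the percentage of tasks where preference and feasibility agree.

    Each row is (worker_preference, feasibility), both assumed to be in 0..1.
    A task counts as aligned when both values are at or above the threshold,
    or both are below it.
    """
    rows = list(audit)
    if not rows:
        raise ValueError("audit must contain at least one task")
    aligned = sum((pref >= threshold) == (feas >= threshold) for pref, feas in rows)
    return 100.0 * aligned / len(rows)


# Illustrative rows only; these numbers are made up for the example.
sample_audit = [
    (0.78, 0.45),  # wanted, but not yet feasible
    (0.80, 0.90),  # wanted and feasible
    (0.20, 0.10),  # neither wanted nor feasible
    (0.30, 0.70),  # feasible, but workers prefer to keep it
]
print(alignment_rate(sample_audit))  # 50.0
```

Tracking this number before and after a pilot gives the 45-day audit a single figure to report, though the threshold should be agreed with the teams being surveyed.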
Implications for Builders / No-Code Teams
- Task Audit Workflow: Build a no-code survey and dashboard (using tools like Airtable + Zapier or Retool) that collects employee input on which tasks they want automated or augmented, then routes responses to technical teams for feasibility scoring. Include fields for task frequency, complexity, and current pain points. Set up automated reports that flag high-mismatch areas (tasks workers want automated but AI can't handle, or tasks AI can do but workers resist).
- AI Capability Matcher: Create a workflow that takes a task description (via form or Slack command), runs it through an LLM to classify automation feasibility, then returns a structured assessment (low/medium/high feasibility + reasoning). Store results in a shared database so teams can see patterns across similar tasks. Add a human review step for any "high feasibility" recommendations before they move to implementation.
- Preference-Weighted Deployment Pipeline: Design a scoring system that combines worker preference data with technical feasibility ratings to prioritize which AI agent projects to build first. Use a simple formula (e.g., priority_score = worker_preference * feasibility * task_frequency) and automate the ranking. Surface top candidates to leadership via weekly digest.
- Skills Transition Tracker: Build a lightweight agent that monitors which tasks are being automated or augmented in your organization, then suggests skill development resources for affected employees. Connect to your learning management system to recommend courses focused on judgment, collaboration, and contextual problem-solving. Track completion rates and skill gap closure.
- Guardrails for High-Stakes Tasks: For tasks involving sensitive decisions (hiring, performance reviews, resource allocation), implement human-in-the-loop workflows where AI provides analysis and recommendations but a human makes the final call. Use AI to surface relevant data and flag edge cases, but require explicit human approval before action. Log all decisions for audit purposes.
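The priority formula and the mismatch flags described in the list above can be sketched in plain Python before committing to a no-code tool. The formula itself comes from the text; everything else is an assumption for illustration: the `Candidate` fields, the 0-to-1 scales for preference and feasibility, frequency measured in occurrences per week, and the 0.3 gap used to flag a mismatch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    task: str
    worker_preference: float  # assumed scale: 0..1
    feasibility: float        # assumed scale: 0..1
    task_frequency: float     # assumed unit: occurrences per week


def priority_score(c: Candidate) -> float:
    # Formula from the article: preference * feasibility * frequency.
    return c.worker_preference * c.feasibility * c.task_frequency


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from highest to lowest priority score."""
    return sorted(candidates, key=priority_score, reverse=True)


def mismatch_flag(c: Candidate, gap: float = 0.3) -> Optional[str]:
    """Flag tasks where preference and feasibility diverge by at least `gap`."""
    diff = c.worker_preference - c.feasibility
    if diff >= gap:
        return "wanted but not yet feasible"
    if -diff >= gap:
        return "feasible but resisted"
    return None


# Illustrative candidates only; the numbers are invented for the example.
candidates = [
    Candidate("scheduling", 0.78, 0.45, 10),
    Candidate("strategy memo", 0.20, 0.70, 2),
    Candidate("data entry", 0.90, 0.80, 20),
]

for c in rank(candidates):
    print(f"{c.task}: score={priority_score(c):.2f}, flag={mismatch_flag(c)}")
```

Because the three factors are multiplied, a task with near-zero preference or feasibility drops to the bottom regardless of how often it occurs, which matches the intent of respecting both constraints. A weighted sum would behave differently and may suit teams that want frequency to count for more.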
Caveats & Risks
The WORKBank dataset represents a specific snapshot of the U.S. workforce in 2025, and worker preferences may shift as AI capabilities evolve or as employees gain more experience with AI agents. The study relies on self-reported worker preferences, which may not reflect actual behavior or organizational needs. Technical feasibility assessments are based on expert judgment of current AI capability, which changes rapidly.
Operationally, organizations face several risks: overreliance on worker preferences can lead to underinvestment in beneficial automation if employees resist change, while ignoring preferences entirely can trigger resistance and low adoption. There's also a risk of optimizing for current skills rather than future organizational needs, or deploying AI too quickly without adequate training or change management.
To mitigate these risks, organizations should treat the framework as a starting point for dialogue rather than a deterministic decision tool. Implement pilot programs that test both worker acceptance and technical performance before scaling. Establish clear governance for high-stakes decisions, maintain human oversight for ambiguous or sensitive tasks, and invest in ongoing skill development so workers can adapt as AI capabilities expand. Regular reassessment (every 6-12 months) ensures the framework stays aligned with both worker expectations and organizational strategy.
Caselets
Enterprise Manufacturing Firm: A multinational manufacturer with 15,000 employees used this framework to redesign its operations workflows. They discovered that production floor workers wanted AI to automate safety checklists and equipment diagnostics, but preferred human judgment for process optimization decisions. The company deployed AI agents for routine monitoring and reporting, freeing up 4-6 hours per week per supervisor. They redirected that time to coaching and cross-functional problem-solving, areas where workers felt their expertise was most valuable. Within six months, they saw a 12% increase in process improvement suggestions from the floor, with no reduction in safety performance.
Professional Services Startup: A 40-person consulting firm applied the framework to their client delivery work. Employees wanted AI to handle research synthesis and slide formatting, but strongly preferred to retain client-facing strategy work. The firm built a lightweight AI agent (using GPT-4 and n8n workflows) that automated literature reviews and generated first-draft presentations based on structured briefs. Consultants reported saving 8-10 hours per project, which they reinvested in deeper client conversations and more nuanced recommendations. Client satisfaction scores increased 18% over the next quarter, and the firm won two larger engagements that required more strategic thinking and less execution work.
References
This article is based on the following research paper:
Shao, Y., Zope, H., Jiang, Y., Pei, J., Nguyen, D., Brynjolfsson, E., & Yang, D. (2025). Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce. arXiv preprint arXiv:2506.06576. https://arxiv.org/abs/2506.06576
Related Research
For deeper exploration of AI's impact on workforce tasks and skills, see these related studies:
- The Foundational AI Exposure Study: 80% of the Workforce Will Feel LLM Impact - The most-cited research revealing that 80% of workers face 10%+ task exposure to LLMs, with higher-income knowledge work facing greater impact than lower-wage roles.
- The Great Skills Leveler: How AI Compresses Experience Gaps - Study of 5,172 customer support agents showing how generative AI enables novices to perform at near-veteran levels, fundamentally disrupting traditional talent economics.
- Current and Future Use of Large Language Models for Knowledge Work - Longitudinal study tracking 107 knowledge workers over one year, revealing how LLM usage evolved from isolated tasks to workflow integration and organizational data connectivity.
- The Hidden Cost of Automating Entry-Level Work: When AI Blocks Skills Transfer - Research examining how automating junior roles disrupts the apprenticeship model that transfers tacit knowledge from experienced to novice workers.
Frequently Asked Questions
What is the WORKBank dataset, and why is it important?
The WORKBank dataset covers 844 distinct tasks across 104 occupations representing the U.S. workforce. It's important because it provides the first comprehensive, worker-preference-aligned view of which tasks employees actually want AI to handle versus which tasks AI is technically capable of handling. Most AI deployments fail because they optimize for technical capability alone, ignoring whether workers will actually adopt the tools. WORKBank gives leaders a data-driven foundation for making deployment decisions that respect both constraints.