An LLM's Take on Work
I've always been fascinated by how large language models "think" about our work. So, I decided to run a little experiment. I gave a GPT model (gpt-4o-mini) a pretty unique task: to go through a massive list of job postings and score each one from 0 to 100. But instead of the usual stuff like salary or experience, I gave it three abstract criteria to judge by: autonomy, innovation, and technical challenge.
I think the results offer a fascinating look into how an AI interprets our professional world. It's a raw, unfiltered perspective, free from my own biases, that shows what the model has learned from the mountains of text it was trained on. It’s a peek into its view of the world.
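For readers curious what such a scoring setup might look like, here is a minimal sketch. The prompt wording, the function names `build_prompt` and `parse_score`, and the "integer only" reply format are all my assumptions for illustration, not the exact setup used in the experiment. The call to the model itself is left out; the sketch only covers assembling the prompt for one posting and validating the reply.

```python
import re
from typing import Optional

# The three criteria named in the post.
CRITERIA = ("autonomy", "innovation", "technical challenge")


def build_prompt(title: str, description: str) -> str:
    """Assemble a hypothetical scoring prompt for one job posting."""
    criteria = ", ".join(CRITERIA)
    return (
        f"Score the following job posting from 0 to 100 based on: {criteria}.\n"
        "Reply with the integer score only.\n\n"
        f"Title: {title}\n"
        f"Description: {description}"
    )


def parse_score(reply: str) -> Optional[int]:
    """Return the first integer in the reply, or None if missing or outside 0-100."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    score = int(match.group())
    return score if 0 <= score <= 100 else None
```

Rejecting out-of-range or non-numeric replies, rather than guessing, keeps bad outputs from quietly skewing the averages later on.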
The Predictable, and Revealing, Hierarchy
When I grouped the scores by job category, a clear pattern emerged. It was pretty much what you'd expect: roles like engineering management, machine learning, and product management shot to the top, mostly scoring in the 85-90 range. Down at the bottom were jobs like delivery drivers and brand reps, hovering between 55 and 65.
This hierarchy makes a lot of sense. The model, when asked to look for autonomy, innovation, and technical challenge, essentially recreated the modern economy's perceived value ladder.
- The Top Tier: An Engineering Manager at Coinbase (Score: 90) or a Member of Technical Staff - Data Platform Lead at Basis AI (Score: 90) are good examples. These jobs are packed with technical challenge, require a ton of autonomy, and are all about innovation.
- The Foundation: On the flip side, a job as a delivery driver or a commercial truck driver operates under a completely different set of rules. The technical challenge is lower, the tasks are often repetitive, leaving less room for autonomy, and the main goal isn't really innovation. These roles are built for efficiency, not creative freedom.
This first pass confirmed something cool: the model has a solid internal map of the professional world. It figured out that the jobs we often consider "high-impact" are the ones that let people think for themselves and build new things.
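The grouping step behind this hierarchy can be sketched in a few lines. This is only an illustration: the `rank_categories` helper, the category labels, and the sample rows are made up for the example and are not the actual dataset.

```python
from collections import defaultdict
from statistics import mean


def rank_categories(rows):
    """Return (category, mean score) pairs, highest mean first."""
    by_category = defaultdict(list)
    for category, score in rows:
        by_category[category].append(score)
    averages = {category: mean(scores) for category, scores in by_category.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Illustrative rows only, not real results from the experiment.
sample = [
    ("engineering management", 90),
    ("engineering management", 86),
    ("delivery driver", 55),
    ("delivery driver", 65),
]

for category, average in rank_categories(sample):
    print(f"{category}: {average}")
```

A plain mean is the simplest choice here; with very uneven category sizes, a median or a minimum count per category might give a steadier ranking.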
Digging into the Details
This is where it gets really interesting for me. The model wasn't just making broad generalizations; it was picking up on some surprisingly subtle details.
The AI Premium
Even within the high-scoring engineering category, there was a pecking order. AI and Machine Learning roles consistently outscored traditional software engineering jobs. For instance:
- Engineering Lead, AI Product at The Browser Company (Score: 85)
- Senior Data Scientist at ShiftKey (Score: 85)
- Software Expert – AI/ML Architecture (Expert Role) at Bosch Global Software Technologies Private Limited (Score: 85)
When Context is Everything
The most compelling outliers are the ones that force you to rethink a category entirely. While most driver roles scored low, I found a "Senior Heavy Truck Driver" at Halliburton that scored an 80, placing it on par with senior software engineers. The model understood this isn't a simple delivery job. It's a highly specialized role in the demanding energy sector.
This single data point suggests the model can look past a simple title and infer the true nature of the work.
Across Industries: High-Score Standouts
Beyond the usual suspects in tech, the model identified many roles across diverse fields that scored in the mid-80s to 90. These examples show how widely autonomy, innovation, and technical challenge can be found across the job market:
- Radiation Oncology Senior Dosimetrist – University of Maryland Medical System (Score: 85): In the new Stoler Center for Advanced Medicine, this role involves developing sophisticated cancer treatment plans using cutting-edge technology.
- Networking Architect – Optics – OpenAI (Score: 90): At the intersection of hardware and AI, this role defines OpenAI’s next-generation optical networking for AI supercomputers. It involves building out hyperscale systems with cutting-edge photonic technology. It's a pioneering position blending deep technical challenge with innovation in infrastructure, hence the top-tier score.
- Lead Infant Teacher – Guidepost Montessori (Score: 85): An education role scored on par with senior tech jobs. Montessori teaching emphasizes independence and a prepared environment for exploration. In this role, the teacher (or "guide") carefully designs a classroom that lets children choose their own activities and learn at their own pace. The model may have rewarded the autonomy and innovative child-led philosophy inherent in this job.
- Senior Nuclear Scientist – Xcimer Energy (Score: 85): Working on fusion energy at a startup, this role is explicitly about “groundbreaking” technology that could redefine the future of energy. It involves high-level research in inertial fusion, simulations, and materials under extreme conditions. This shows that world-changing technical challenges, even outside software, get recognized and scored highly by the model.
- Strategic Partnerships Manager (VoD) – Apple Vision Pro (Score: 85): A role at Apple that bridges content and technology for the Vision Pro (Apple’s AR/VR headset). It requires a mix of creative insight and technical knowledge of immersive media. The job focuses on defining new video-on-demand experiences in spatial computing and working with partners to push the medium forward. It’s a clear case of innovation at the cutting edge of entertainment tech.
- Spaceflight Physician – Vast (Score: 85): Medicine meets aerospace. This physician role works with engineers on designing space station life support systems and ensuring astronaut health on long-duration missions. The description makes it clear how unique this job is, even involving contributions to spacecraft engineering and crew training for a future space station. The model saw the high autonomy and novel technical challenges in pioneering the field of space medicine.
- Soft Goods Research Developer – Meta (Score: 85): A Meta (Oculus) role that involves developing advanced textiles for robotics and wearable devices. It combines material science with electromechanical design, using 3D knitting and novel fabrics to build the future of soft robotics. This unusual mix of fashion and tech R&D exemplifies innovation, which the model duly rewarded.
- Division Chair – Head and Neck Surgery, Mayo Clinic (Score: 85): A top medical leadership position leading a multidisciplinary surgical oncology team. This role involves not only performing complex surgeries but also driving research, mentorship, and the strategic growth of the division. High autonomy, influence, and the push for medical innovation place it among the upper-tier scores, even though it's in healthcare.
So, What's the Point? A Mirror and a Map
This whole experiment gave me two big takeaways.
First, the AI is a mirror. It might not have its own opinions, but it perfectly reflected the values it found in our own writing. It learned that as a society, we tend to value creation over repetition and autonomy over being told what to do. The scores are a data-driven look at the cultural premium we place on jobs that push boundaries, especially through tech.
Second, and I think this is the really cool part, this process can be a map. It points to a new way of looking for work. Imagine if you could filter jobs not by title, but by asking, "Show me roles with the most autonomy" or "Find me a job where I can really innovate." It could change the job hunt from a keyword-matching game to a search for work that truly fits you. For anyone curious about what's out there, this could be a powerful tool for discovering interesting and fulfilling work.
As a first step, we are adding a way to browse jobs by spark_score. It will be included in general search as we gain more confidence in the data's usefulness.
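To give a rough idea of what browsing by spark_score could look like, here is a small sketch. The `Job` record and the `browse_by_spark_score` function are hypothetical names chosen for illustration; only the `spark_score` field name and the three example scores come from this post. It also assumes a single combined score per job, so filtering on one criterion such as autonomy alone would need per-criterion scores that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    title: str
    company: str
    spark_score: int  # combined 0-100 score, as described in the post


def browse_by_spark_score(jobs, min_score=0, limit=None):
    """Return jobs at or above min_score, highest spark_score first."""
    matching = [job for job in jobs if job.spark_score >= min_score]
    ranked = sorted(matching, key=lambda job: job.spark_score, reverse=True)
    return ranked if limit is None else ranked[:limit]


# Three postings and scores mentioned earlier in the post.
jobs = [
    Job("Senior Heavy Truck Driver", "Halliburton", 80),
    Job("Networking Architect - Optics", "OpenAI", 90),
    Job("Lead Infant Teacher", "Guidepost Montessori", 85),
]

for job in browse_by_spark_score(jobs, min_score=85):
    print(f"{job.spark_score}  {job.title} ({job.company})")
```

Because the sort ignores title and category, a teacher and a network architect can sit side by side in the results.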