Personal AI portfolio

NxtWave.ai: Doug's AI Project Lab

AI tools, automation experiments, desktop apps, and robot ideas I am building, testing, and learning from.

Not an agency. Not a startup pitch. Just a living portfolio of practical and sometimes weird projects I am working on.

Current focus AI Lab applied systems in progress
Build Applied AI tools that turn messy workflows into usable systems
Model Machine learning, forecasting, dashboards, and automation experiments
Explore Desktop apps, privacy workflows, product prototypes, and robotics concepts

Selected toolkit

AI / ML LLM workflows, machine learning, forecasting, computer vision
Build Python, SQL, Google Cloud, Vercel, Electron apps
Workflow Automation, dashboards, privacy workflows, robotics concepts

About me

A personal project lab for applied AI, automation, and creative problem solving.

I am Douglas Bobeck, an applied AI builder with a background in intelligence analysis, operations, automation, and program management, currently pursuing a master's degree in data science. I use NxtWave.ai to document the systems, prototypes, and experiments I build while exploring how AI can solve practical problems.

This is not a traditional company website. It is a living portfolio of builds that show how I think, design, test, and turn scattered ideas or workflows into working software. The tone is personal because the work is personal: I build real things, experiment fast, and keep improving the work as I go.

ModePersonal AI portfolio
EducationCurrently pursuing a master's in data science
MIT Professional Education Leveraging AI for Effective Decision Making View certificate
FocusApplied AI, ML, automation
StylePractical builds, fast prototypes

Featured builds

A few selected builds from a broader set of AI and automation work.

These are representative examples, not the full list. I use this section to showcase practical AI workflows, operational dashboards, desktop tools, privacy ideas, and experimental systems that turn scattered work into useful software.

ProjectDesk desktop workspace

A desktop AI workspace for files, apps, and everyday project work

ProjectDesk is a desktop-first AI workspace built for nontechnical solo builders, small business operators, and people who work with files every day. It wraps Codex-style app building and automation inside a more casual interface, so users can create apps, organize folders, edit files, prepare reports, run tasks, and manage project memory without using terminal, Git, or raw developer tools.

Because ProjectDesk runs from the desktop, it can work directly with approved local folders and files. Private document workflows can mask sensitive details locally before AI review, helping users protect names, numbers, client details, sheet names, tabs, and other real data before using OpenAI Codex.

  • What it does: creates apps, files, folders, reports, summaries, packaged desktop apps, and local outputs from plain-language requests.
  • Who it is for: nontechnical users who know what they want built, organized, or automated, but do not want to learn developer tools first.
  • Why I built it: Codex and Claude Code are powerful, but they can still feel intimidating to people who do not work in terminals or code editors. ProjectDesk makes that kind of power feel more approachable.
  • Real test example: in one test, ProjectDesk organized hundreds of loose desktop files into a cleaner managed folder structure. In other tests, it created small apps, games, workflow tools, and installable desktop app packages from prompts.
  • Privacy approach: ProjectDesk uses OpenAI Codex under the hood, but includes a local masking layer so users can replace sensitive data before files are sent for AI review.
  • Memory approach: ProjectDesk includes visible, local project memory that users can view, use, edit, open, refresh, or delete. Full chat conversations are not treated as hidden memory.
  • Permission model: Default Mode explains major actions first and asks for approval. Full Access Mode is available for trusted workflows after the user acknowledges a clear warning.
  • Output: ProjectDesk creates inspectable local project files, folders, README instructions, and installable desktop app packages such as Mac .dmg and Windows .exe builds.
  • Status: fully operational private beta. Currently being tested by a small group before public release at MyProjectDesk.ai. Mac signing is active; Windows signing, Google verification, final beta fixes, and release polish are in progress.
  • What I learned: making AI useful is not just about model capability. Privacy, permissions, local file control, memory transparency, clear outputs, and user confidence matter just as much.
Desktop + Web
Electron desktop app Next.js web layer Vercel production hosting Desktop-to-web account connection Local bridge
AI + Task Execution
OpenAI Codex API Codex-style local task execution Plain-language workflows Local project generation Mac and Windows app packaging
Files + Automation
Approved workspace folders Local file creation and editing Folder organization and cleanup README generation Scheduled tasks Chat-based activity trail
Privacy + Memory
Local document masking Masking vault / restore package Local hidden-wiki memory Visible Memory tab User-editable preferences Local chat/action history Optional cloud sync
Cloud + Account Layer
NextAuth login Google sign-in Google Drive, Gmail, Calendar permissions Supabase Postgres through Vercel Token usage tracking Billing/payment status Web dashboard Desktop Account page
  1. Sign in and connect your workspace: open ProjectDesk on the desktop, sign in, and connect an approved local folder or workspace.
  2. Choose your permission mode: use Default Mode when you want ProjectDesk to explain major actions and ask before making changes. Use Full Access Mode only after acknowledging the warning.
  3. Ask in plain English: ask ProjectDesk to build an app, organize files, clean up a folder, summarize documents, prepare a report, send an email, schedule something, or update an existing project.
  4. Review the plan and approve important actions: in Default Mode, ProjectDesk explains major file moves, deletes, renames, overwrites, emails, calendar updates, or connected actions before proceeding.
  5. Get real local output: ProjectDesk creates files, folders, summaries, app projects, README instructions, and installable desktop packages in the approved workspace.
  6. Keep improving the work: return later and ask ProjectDesk to revise the same app, update files, reuse saved project memory, or continue from an existing project folder.

Call metrics forecasting system

Production call-center forecasting and staffing-risk dashboard

A production-deployed call-center forecasting system built to help operations understand expected demand, staffing risk, and high-volume spike patterns. The tool uses machine learning, historical call activity, calendar context, intraday correction, and staffing guardrails to turn manual daily reporting into a clearer operational planning workflow.

I built this to reduce repetitive forecasting work for a real operations team and help one person move from spreadsheet-heavy daily reporting to a model-supported dashboard view of call demand, staffing risk, forecast accuracy, and spike behavior. The system was deployed through Google Cloud Run and delivered to the client for operational use.

  • What it does: forecasts hourly call volume, shows staffing risk, compares predictions to actuals, identifies spike patterns, and supports clearer call-center planning.
  • Why I built it: the original process required several hours of repetitive daily manual reporting. I wanted to replace that with a production tool that could give operations a faster and clearer view of expected demand.
  • Outcome: reduced multiple hours of daily manual work for one person while improving visibility into demand patterns, high-volume risk, and staffing coverage.
  • Real model work: the system separates the expected forecast from the staffing-protected forecast. The expected forecast predicts likely call volume. The protected forecast adds guardrails for high-volume and very-high-volume risk so staffing decisions are not based only on average accuracy.
  • Testing result: the expected forecast performed well for normal hourly demand, while the protected staffing layer reached roughly 97% overall coverage and stronger coverage during high-volume periods.
  • What I learned: a forecast can be accurate on average and still fail operations if it underpredicts spike periods. For staffing, the most useful system is not just the model with the lowest error; it is the workflow that helps people understand when the forecast may not be safe enough.
  • Status: production-deployed client tool. Delivered through Google Cloud Run for operational use.
Modeling + Forecasting
XGBoost regression XGBoost surge-risk classification Random Forest intraday correction Scikit-learn pipelines Optuna tuning experiments Joblib model artifacts Chronological train/test splitting Complete-date evaluation approach
Data + Feature Engineering
Python Pandas NumPy CSV exports Excel/source file parsing Hourly call-volume tables Historical call activity Calendar and closure features School-break and reopen context Monday and high-volume risk features Day-ahead safe message/context features
Operations Logic
Expected hourly forecast Protected staffing forecast Recommended agents by hour High-volume detection Very-high-volume detection Spike-risk scoring Dangerous miss analysis P80/P90/P95/P97/P99 staffing guardrails Intraday remaining-day forecast updates
Dashboard + Reporting
Forecast tables Actual-vs-predicted review Worst-day and worst-hour diagnostics Feature-importance reports Error-by-hour reports Error-by-weekday reports Volume-bucket diagnostics CSV output files Dashboard-ready forecast snapshots
Deployment + Cloud
Google Cloud Run production deployment Google Cloud services Forecast snapshot outputs Client handoff documentation Production model artifacts Operational dashboard workflow
  1. Load historical contact-center activity: the system starts with historical call-center data, including hourly call volume, forwarded calls, message context, operating-day details, and calendar information.
  2. Prepare the forecasting table: the data is cleaned, checked for date/day mismatches, organized into hourly records, and prepared with features that are safe for day-ahead forecasting.
  3. Generate the expected forecast: the model predicts expected hourly call volume so the operations team can see what normal demand is likely to look like.
  4. Add staffing protection: the system applies staffing guardrails to the expected forecast so high-volume and very-high-volume periods are less likely to be underplanned.
  5. Review spike and risk context: the dashboard helps explain where risk is higher, including Mondays, reopen days, school-break timing, January backlog patterns, and other high-volume signals.
  6. Compare predictions to actuals: forecasts are compared against actual call volume so the team can review accuracy, identify worst misses, and understand where the model performs best or needs more operational signal.
  7. Update during the day: intraday correction uses observed morning volume to revise the remaining-day outlook and support same-day staffing adjustments.
  8. Export and use the results: the system produces dashboard-ready tables, CSV outputs, forecast snapshots, staffing context, and handoff documentation for operational use.

TelCo analyst workflow system

High-volume telecom data analysis with local-model privacy controls

A high-volume telecom analysis workflow built during a one-month operational project where two analysts needed to review tens of thousands of daily telecom records within a 24-hour cycle. Their findings helped guide the next day's prioritization, so the workflow had to be fast, repeatable, and useful under real operational pressure.

After reviewing the project scope, I recognized that the manual process would not be feasible with the available analyst capacity. I built this system on my own initiative to turn raw telecom activity into structured metrics, searchable summaries, relationship views, translated message review, and analyst-ready outputs.

Because the data was sensitive, the workflow used local models where AI-supported processing was needed. This allowed translation, keyword review, text processing, and exploratory analysis to happen without relying on external AI services for sensitive telecom records.

The tool became a critical enabler for the project. Without it, the required daily review cycle would not have been realistic for a two-analyst team.

  • What it does: ingests telecom activity data, normalizes subscriber identifiers, analyzes calls, SMS, and web-visit records, generates subscriber-level metrics, maps relationships, highlights keyword activity, translates message content locally, and produces analyst-ready outputs.
  • Why I built it: the project required two analysts to review very large daily datasets within 24 hours so their findings could support the next day's prioritization cycle. A manual workflow would not have scaled.
  • Outcome: made a one-month high-volume analysis project operationally feasible by converting raw daily telecom records into structured metrics, reports, searchable tables, and clearer analyst review workflows.
  • Privacy approach: used local models for sensitive AI-supported processing so telecom records, message content, and analytical context did not need to be sent to external AI services.
  • Real operational value: reduced the burden of manually sorting through tens of thousands of daily rows and helped analysts focus on patterns, relationships, outliers, activity summaries, and priority leads instead of raw spreadsheet review.
  • Core analysis areas: call behavior, inbound/outbound activity, SMS activity, keyword hits, local translation, web-visit/IPDR activity, panel usage, after-hours behavior, outlier calls, peer relationships, and network centrality.
  • What I learned: in high-tempo data projects, the real value is not just analysis accuracy. It is building a repeatable system that turns raw records into decision-ready outputs fast enough to affect the next operational cycle.
  • Status: used during a one-month operational project. Built independently to support daily analyst production.
Data Processing
Python Pandas NumPy Excel ingestion CSV exports SQLite database Incremental table loading Subscriber normalization MSISDN parsing and cleanup Scientific-notation phone number handling
Telecom Analysis
Call-detail record processing SMS record processing Web-visit / IPDR processing Inbound and outbound activity metrics Peer/contact extraction Panel usage tracking After-hours activity detection Weekend activity ratios Duration outlier detection Reciprocity scoring Subscriber-level summary tables
Network + Relationship Analysis
NetworkX Directed call graph Degree centrality Betweenness centrality Closeness centrality Eigenvector centrality Clustering coefficient Top contacts by call count Relationship and peer mapping
Local AI + Privacy
Local model workflows Helsinki-NLP translation model Transformers pipelines Local SMS translation Local keyword and text review Local exploratory analysis No external AI service for sensitive text
Language + Text Review
SMS keyword search Message translation to English Language detection Translated SMS conversation review Keyword-focused message sections
Reporting + Analyst Outputs
Word report generation Profile-style subscriber reports Flat summary CSV Normalized SQL tables Call summary exports SMS summary exports Web visit exports Dashboard-ready outputs Analyst handoff files
Dashboard + Query Layer
Streamlit interface SQLite-backed analytics database Natural-language Text2SQL explorer Saved prompt-to-SQL knowledge base Downloadable CSV query results Knowledge base review mode
Experimental / Research Components
PyTorch DQN exploration Message embedding experiments Clustering experiments DBSCAN Cosine similarity Archived reinforcement-learning prototypes Archived clustering prototypes
  1. Receive daily telecom data: the team received large daily telecom datasets that needed to be reviewed within a 24-hour cycle.
  2. Load and normalize records: the system imported call-detail records, SMS data, subscriber lists, and web-visit/IPDR records. Subscriber identifiers were cleaned and normalized so records could be joined reliably across sources.
  3. Build subscriber-level metrics: the tool calculated inbound and outbound call counts, average duration, missed-call rates, weekend activity, after-hours activity, outlier calls, SMS counts, keyword hits, unique contacts, panel usage, and web-visit activity.
  4. Map relationships: the system created relationship metrics and graph-based indicators, including contact lists, reciprocal communication, centrality measures, clustering coefficients, and top peer relationships.
  5. Translate and review text locally: SMS content could be translated locally into English, searched for analyst-defined keywords, and grouped into conversation-style sections for faster review without sending sensitive text to external AI services.
  6. Generate analyst-ready outputs: the workflow produced profile reports, CSV summaries, normalized SQL tables, and dashboard-ready outputs so analysts could review structured findings instead of manually sorting through raw spreadsheets.
  7. Query the data: a Streamlit-based explorer allowed users to query the SQLite analytics database, reuse saved prompt-to-SQL examples, and export result tables.
  8. Support next-day prioritization: the outputs helped analysts identify important records, relationships, and activity patterns quickly enough to inform the next day's project cycle.

Local fine-tuned model demo

Base model vs. personally fine-tuned local model

A local fine-tuning demo showing how the same prompt changes after a base model is adapted on a custom personal voice-and-domain dataset. I built the demo to make fine-tuning easy to understand: one side shows the generic base-model answer, while the other shows a more personal, domain-aware response style shaped by my own examples, Marine Corps experience, leadership background, and practical communication style.

To create the training data, I started with a small seed set of roughly 20 to 30 questions that I answered myself. I then expanded those examples into a larger training set using prompt variation, alternate phrasings, and related examples so the model could learn the style across many versions of similar questions.

The model work was run locally, using Apple MLX for local model experimentation and tuning on Apple Silicon. The goal was not to build a commercial model, but to create a simple, dramatic, easy-to-grasp demonstration of what fine-tuning does: moving a model from generic answers toward a more specific voice, tone, and domain perspective.

  • What it does: compares a generic local base model against a locally fine-tuned personal voice model using the same prompt.
  • Why I built it: to show nontechnical people what fine-tuning actually changes in a clear side-by-side demo.
  • Training approach: started with roughly 20 to 30 self-written Q&A examples, then expanded them into a larger training dataset through prompt variation, alternate phrasing, and related examples.
  • Model approach: used a local Apple Silicon workflow with Apple MLX for model experimentation, local tuning, and response comparison.
  • Outcome: made the difference between a general model and a tailored model easy to see. The base model gave more generic answers, while the tuned model responded with a more personal, domain-aware style.
  • Core idea: fine-tuning is easier to understand when people can compare two answers to the same question: one generic, one shaped by specific examples and communication style.
  • Privacy approach: the demo was run locally, showing how personal model experiments can be tested without relying on cloud-hosted responses for every prompt.
  • What I learned: model behavior is not only about facts. Tone, style, context, background, answer structure, and training examples can dramatically change how useful a response feels.
  • Status: local fine-tuning concept demo.
Local Model Runtime
Ollama llama3.2 Apple Silicon local runtime Local LLM execution Terminal-based model testing Offline-capable demo workflow
Fine-Tuning + Model Experimentation
Apple MLX Local model adaptation Personal voice fine-tuning Custom Q&A training data Training-data augmentation Prompt variation Same-prompt comparison testing
Dataset Creation
Self-written seed dataset 20 to 30 original Q&A examples Expanded prompt variations Alternate phrasing generation Related question generation Style and tone examples Domain-aware response examples
Optimization + Testing
Hyperparameter tuning experiments Bayesian optimization-style tuning Training run comparison Response quality review Before-and-after prompt testing Tone and relevance evaluation Consistency checks across related prompts
Model Comparison
Generic base model response Fine-tuned personal model response Side-by-side output review Base vs. tuned behavior comparison Usefulness and voice alignment review
Personalization Layer
Marine Corps leadership context Operations and decision-making style Practical communication tone Experience-grounded answer structure Audience-specific response shaping Personal voice and domain context
Privacy + Experimentation
Local-first AI testing Private prompt experimentation Local model workflow No cloud model required for demo comparison Personal style testing without public deployment
  1. Create the original Q&A examples: I started with roughly 20 to 30 questions and answered them myself to capture the tone, structure, background, and communication style I wanted the model to learn.
  2. Expand the dataset: the original examples were expanded into a larger training set using prompt variation, alternate phrasing, and related questions.
  3. Prepare the local training workflow: the training examples were prepared for a local Apple Silicon workflow using Apple MLX and local model tooling.
  4. Tune the model: the base model was adapted using the custom dataset so it could respond in a more personal, domain-aware style.
  5. Test the base model: the same prompt was first sent to the original local base model to capture the generic answer.
  6. Test the fine-tuned model: the same prompt was then sent to the tuned model to show how the response changed after training.
  7. Compare the outputs: the demo compared the two responses side by side, focusing on tone, specificity, relevance, structure, and usefulness.
  8. Refine the tuning: training settings and example coverage could be adjusted to improve consistency, voice alignment, and response quality across related prompts.

Bespoke applications

Smaller tools built for specific problems.

Beyond my larger projects, I have developed a range of bespoke applications that solve focused analysis, automation, privacy, and reporting problems.

Infrastructure Detection with YOLO

I fine-tuned a YOLO model to identify communication infrastructure using Google Earth imagery. By automating frame-by-frame analysis with Python and reaching around 79% recall and 87% precision, the project created a reliable tool for visual security assessments.

Wi-Fi OUI Analysis Tool

I developed a tool that analyzes wardriving data by identifying device manufacturers from MAC address OUIs. It automated a multi-day process into under a minute, giving instant insight into detected security cameras and network devices.

Unit and Currency Converter

To streamline large reports, I created a converter that changed metric measurements to imperial units and standardized currencies to USD. This reduced manual errors and turned time-consuming conversions into a seamless reporting step.

Selective AES Encryption

I built an encryption tool that supports targeted AES encryption of text, numbers, or entire files. It helps protect sensitive information when integrating AI, giving users more control over what data stays private.

AI robotics lab build

A small home robot concept for dog-toy cleanup.

This is a personal build with a simple reason behind it: it is cool, it is a good robotics challenge, and I do not want to keep picking up dog toys from around the house. The goal is to turn that everyday problem into a 2 to 3 foot experimental home robot that can recognize toys, decide what to collect, and return them to a basket.

The build brings together Lego robotics for the physical platform, a Raspberry Pi for onboard control, a YOLO-style computer vision model for toy detection, a Llama cloud model for higher-level task planning, and an older Mac acting as a local server for heavier coordination work.

Lego robotics Raspberry Pi YOLO computer vision LLM planning Llama cloud model Mac local server

Contact / resume / links

This portfolio evolves as new builds, prototypes, and experiments are completed.

I will keep adding project snapshots, stronger visuals, workflow notes, and practical lessons as each build moves from idea to working example.