Singapore-based AI engineer | AI platforms | evaluation systems

Wes Lee

Builds backend-first AI platforms, multimodal evaluation systems, and operator-facing decision tools.

Currently AI Engineer and ASEAN Education Program Director at Elice. This site focuses on inspectable systems and selected private-code case studies across retrieval, evaluation, finance, forecasting, and decision support.

View selected work | About me

AI platforms
Evaluation systems
Retrieval and orchestration
Operator-facing products

Flagship systems | Supporting depth

Selected flagship systems across finance, retrieval, evaluation, and document intelligence

Four flagships plus three supporting systems

Case studies across platform architecture, retrieval, evaluation, finance, and operator-facing decision products.

4 flagship systems for platform, evaluation, and product execution

3 supporting systems across workforce intelligence, DSPy, and enterprise RAG

18 articles that explain the implementation and evaluation tradeoffs

Flagship case studies

Start with four flagship case studies that best represent the portfolio.

This set is intentionally small: these four give the clearest signal of platform architecture, evaluation rigor, and operator-facing product execution.

Edtech / AI platform

Creator AI Platform

A backend-first orchestration platform for learning-asset generation with staged discovery, retrieval, validation, and human review built into the product itself.

Workflow orchestration and quality gates
Discovery, retrieval, generation, and review stages
Private codebase with public case study

Project page | Article | Private repo

Document intelligence / service architecture

Intelligent Content Analyzer

A modular document intelligence platform that splits upload, retrieval, generation, and evaluation into services rather than one monolithic demo app.

Hybrid retrieval and confidence checks
FastAPI gateway plus dedicated services
Multilingual workflow support

Project page | Article | Live demo

Evaluation systems / multimodal

ArtifactBench AI Evaluation Workbench

A multimodal evaluation workbench for AI-generated artifacts such as decks, PDFs, and code bundles. It combines deterministic metrics, retrieval, provenance checks, and judge-assisted scoring.

Deterministic metrics plus retrieval-assisted evaluation
Provenance checks and judge-assisted scoring
Private codebase with public case study

Project page | Article | Private repo

Finance / foundation models

AI Portfolio Advisory System

A robo-advisor platform that applies TabPFN to investor risk assessment and pairs it with objective-aware portfolio optimization.

TabPFN risk profiling
Dynamic portfolio objectives
Interactive demo available

Project page | Article | Live demo

Current work

Current work centers on orchestration, evaluation, and operator review.

AI platforms

Backend-first systems that connect retrieval, generation, validation, and human-in-the-loop review.

Evaluation and observability

Benchmarking, evidence checks, regression coverage, tracing, and structured scoring for model behavior.

Product interfaces

FastAPI, Streamlit, and operator-facing interface patterns that give users a practical surface to work with instead of an opaque model.

Decision domains

Education, workforce intelligence, finance, public health, and business analytics where systems need to drive action.

Recent writing

Writing behind the systems.

Technical notes on implementation choices, evaluation logic, and what changed between experiment and usable product.

May 2, 2026

Building Longevity Lab for Health Scenario Modeling

Why health-risk interfaces need visible data provenance, model cards, runtime mode, and causal-analysis boundaries.


Mar 24, 2026

Designing Creator AI as a Backend-First Platform

Why AI content generation becomes more valuable when orchestration, discovery, validation, and review are part of the product.


Mar 24, 2026

Building ArtifactBench for AI-Generated Artifacts

How deterministic metrics, benchmark retrieval, provenance checks, and judge-assisted scoring support broader artifact evaluation.


Next move

Need an engineer who can connect model quality with product execution?

The best fit is backend-first AI platforms, evaluation-heavy systems, and decision-support products moving beyond the prototype stage.

Start a conversation | Browse GitHub

