Hubert (Marek) Pyskło
I'm a Computational Sciences & History student at Minerva University (graduating May 2027), studying across San Francisco, Seoul, Taipei, Berlin, Hyderabad, Buenos Aires, and Tokyo. I build infrastructure for evaluating and training AI agents.
Last summer I interned as a Research Engineer at Wordware (YC S24), where I built eval infrastructure for AI agents, benchmarked retrieval approaches for agent memory, and developed a filesystem-based approach that outperformed vector databases. Before that I spent four months at Samsung Heavy Industries in South Korea, building an air-gapped RAG system for shipbuilding document analysis. I also built Agent Diff, an open-source platform for RL training and evaluation on APIs like Slack or Linear (arXiv, HuggingFace), and trained a Qwen 30B LoRA adapter via GRPO that nearly doubled its benchmark score (0.31 to 0.59), matching SOTA model performance.
Before the AI work I co-founded Econverse, CEE's largest student startup incubator (3,500+ students, $500K+ raised from Microsoft, Google, and Baker McKenzie). I'm on the board of AI Consensus, where we ran responsible AI hackathons across Asia, including one in Korea with students from 23 countries, sponsored by AWS, Perplexity, and Upstage.
I like 20th-century history (Deng's reforms, the Cold War), poker (founded Minerva Poker Club), skiing, and shooting.
projects
Agent Diff
Research infrastructure for evaluating AI agents and training them with RL on replicas of third-party APIs (Slack, Linear, Box, Google Calendar). 108 endpoints, 224 benchmark tasks, deterministic state-diff evaluation. Multi-tenant isolation via PostgreSQL schema-level sandboxing, plus a snapshot-based diff engine for validating multi-step agent behaviours.
Trained a LoRA adapter via GRPO on Slack + Linear tasks, improving eval scores from 0.31 to 0.59. Pre-print on arXiv, dataset on HuggingFace.
Python, PyTorch, PostgreSQL, SQLAlchemy, Starlette, TypeScript SDK
experience
| 2025 | Research Engineer Intern at Wordware (YC S24). Designed evaluation frameworks, scoring metrics, and test suites with LLM-as-judge verification. Built automated Q&A test generation from internal company data, integrated into the CI/CD pipeline. Benchmarked retrieval architectures (vector DB, SQL, graph, filesystem) for agent memory. |
| 2025 | AI Engineer Intern at Samsung Heavy Industries, South Korea. Built air-gapped RAG system for shipbuilding ITT document analysis — no internet access, no GPT, just local inference. 92.5% accuracy across 217 risk factors. |
| 2024– | Board Member at AI Consensus. Previously Lead for Asia; built Taiwan's largest student AI conference with the Ministry of Digital Affairs. Ran a responsible AI hackathon in Korea with students from 23 countries; partners included AWS, Perplexity, and Upstage. |
| 2024 | Visiting Associate at s20 VC. Due diligence and deal sourcing across e-commerce, circular economy, and AI tools. |
| 2022–25 | Co-Founder & VP at Econverse. Built CEE's largest student startup incubator — 3,500+ students across 4 countries, $500K+ raised from Microsoft, ABB, and National Development Bank. Nationwide AI education campaign reaching 70,000+ students. |
| 2020–22 | Co-Founder at Token Studio. Crypto investment analytics. $50K+ angel round from execs at Getin Noble Bank and BNP Paribas. Didn't find PMF, shut down. |
education
| 2023–27 | Minerva University, San Francisco — B.Sc. Computational Sciences, Minor in History. |
| 2020–22 | IB World School No. 1349, Poznań — International Baccalaureate, 41/45. |
recognition
- RBF Scholar (2025 cohort) — full-ride scholarship for entrepreneurial achievement
- 2x Laureate of the National Economics Olympiad (top 0.3%)
- President's Award for Academic Achievements (Poznań)
media
- XYZ — interview on responsible AI and AI Consensus (2025)
- Forbes Poland — feature on Econverse (2024)
- Gazeta Prawna — podcast on Econverse and youth entrepreneurship (2023)