Data Samples

Request samples built for your model workflows.

See how G2i turns real engineering tasks into high-signal data for evals, environments, post-training, and agent improvement.

From evals to agentic coding workflows, look under the hood at our data samples.

Every layer of the RLHF pipeline is carefully vetted and curated to produce consistent, high-quality data.

Get sample data

Multi-Step Code Reasoning

A dataset that prioritizes difficult, often long-horizon engineering scenarios.

Sample includes:
  • 10 examples
  • Prompt, reasoning steps, debugging/explanation process, final implementation, and unit tests.
  • Scenarios such as concurrency bugs, API integrations, TypeScript typing issues, SQL optimization, and performance debugging.

Preference Ranking

A dataset that evaluates and ranks candidate engineering responses.

Sample includes:
  • 10 examples
  • Two candidate responses, expert ranking, and engineering justification.
  • Evaluation criteria focused on correctness, maintainability, security, efficiency, and instruction adherence.
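As a rough illustration of what one preference-ranking example might look like in code, here is a minimal sketch. The class name, field names, and the "chosen"/"rejected" output shape are assumptions made for illustration, not G2i's actual schema; the `prompt` field is also assumed, since the two candidates need a task to respond to.

```python
from dataclasses import dataclass

# Hypothetical record shape. Field names are illustrative assumptions,
# not the actual schema of the sample dataset.
CRITERIA = (
    "correctness",
    "maintainability",
    "security",
    "efficiency",
    "instruction_adherence",
)


@dataclass
class PreferenceExample:
    prompt: str          # assumed: the task both candidates respond to
    response_a: str      # first candidate response
    response_b: str      # second candidate response
    preferred: str       # expert ranking: "a" or "b"
    justification: str   # engineering justification for the ranking

    def validate(self) -> None:
        """Reject records with an unknown ranking or an empty justification."""
        if self.preferred not in ("a", "b"):
            raise ValueError(f"preferred must be 'a' or 'b', got {self.preferred!r}")
        if not self.justification.strip():
            raise ValueError("justification must not be empty")

    def to_pair(self) -> dict:
        """Convert to a chosen/rejected pair, a common layout for preference data."""
        if self.preferred == "a":
            chosen, rejected = self.response_a, self.response_b
        else:
            chosen, rejected = self.response_b, self.response_a
        return {"prompt": self.prompt, "chosen": chosen, "rejected": rejected}
```

Calling `validate()` before `to_pair()` keeps malformed records out of downstream training or evaluation code. The `CRITERIA` tuple simply mirrors the evaluation criteria listed above and could be used to tag justifications.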

Agentic Coding Workflow Dataset

A dataset for evaluating long-horizon agentic coding workflows.

Sample includes:
  • 3 end-to-end workflow examples
  • Examples that simulate realistic software engineering workflows including repo analysis, implementation planning, code modification, debugging, testing, and explanation.
  • Examples that demonstrate long-horizon reasoning and workflow management.

Evals & Benchmarks

A dataset that measures model performance with scored coding benchmarks.

Sample includes:
  • 1 to 2 benchmark-style evaluations
  • Examples include prompt, gold-standard solution, scoring rubric, and common failure modes.
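To make the benchmark structure concrete, the sketch below models one evaluation with a prompt, gold-standard solution, scoring rubric, and common failure modes. All names here (`RubricItem`, `BenchmarkExample`, `score`) and the points-based scoring scheme are hypothetical; the actual sample may organize and score its rubric differently.

```python
from dataclasses import dataclass, field


@dataclass
class RubricItem:
    criterion: str   # what is being graded, e.g. "correctness"
    max_points: int  # maximum points available for this criterion


@dataclass
class BenchmarkExample:
    # Hypothetical layout; field names are assumptions for illustration.
    prompt: str
    gold_solution: str
    rubric: list[RubricItem]
    common_failure_modes: list[str] = field(default_factory=list)

    def score(self, awarded: dict[str, int]) -> float:
        """Return the fraction of rubric points earned, from 0.0 to 1.0.

        Criteria missing from `awarded` count as zero points.
        """
        total = sum(item.max_points for item in self.rubric)
        if total == 0:
            raise ValueError("rubric has no points to award")
        earned = 0
        for item in self.rubric:
            points = awarded.get(item.criterion, 0)
            if not 0 <= points <= item.max_points:
                raise ValueError(f"points for {item.criterion!r} out of range")
            earned += points
        return earned / total
```

A grader would fill in `awarded` per criterion after comparing a model's output against `gold_solution`, and `common_failure_modes` can serve as a checklist during review.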

Adversarial & Edge Cases

A dataset that tests model robustness against ambiguous and failure-prone coding tasks.

Sample includes:
  • A set of difficult edge-case examples
  • Examples of ambiguous requirements, broken tests, misleading prompts, incomplete specifications, and security-sensitive scenarios.

Review data samples

Schedule a call