Sorted by Tags

AI

An Overview of RL Environments

20 minute read

Everything that happens in an RL environment between the policy update and the next rollout - verification, reward shaping, tool calling, curriculum design, ...

Agents

An Overview of RL Environments

Annotation

CUDA

Chatbots

Constrained Decoding

DVC

Data Science

Data Versioning

Data-Augmentation

Data-Science

Dataset creation

Deep Learning

Engineering

Environments

An Overview of RL Environments

Evaluation

GPT

GPT3

GPT4

GPU

GRPO

An Overview of RL Environments

HuggingFace

ICV

In-Context Vectors

Inference

Jupyter Notebook

LLM

LMs

Language Models

An Overview of RL Environments

LoRA

ML

Machine Learning

An Overview of RL Environments

Model Serialization

Model management

NLP

Notebook

OpenAI

Optimization

PDF

PDF parsing

Performance

Pickle

Product Development

Prototyping

PyTorch

Python

QLoRA

RL

An Overview of RL Environments

Reading PDFs

Reinforcement Learning

An Overview of RL Environments

Sandbox

An Overview of RL Environments

Security

Shorts

Testing

Tooling

Tools

Training

An Overview of RL Environments

Writing to PDF

agents

automation

data science

finetuning

foma

hidden states

huggingface

iPython Notebook

interviews

nlp

safetensors

tooling

transformers

ui

vision