How Cloudflare’s tokio-quiche Makes QUIC and HTTP/3 a First Class Citizen in Rust Backends

Cloudflare has open sourced tokio-quiche, an asynchronous QUIC and HTTP/3 Rust library that wraps its battle-tested quiche implementation with the Tokio runtime. The library has been refined inside production systems such as Apple iCloud Private Relay, next-generation Oxy-based proxies, and WARP’s MASQUE client, where it handles millions of HTTP/3 requests per second with low latency and high throughput. tokio-quiche targets Rust teams that want QUIC and HTTP/3 without writing their own UDP and event loop integration code.

From quiche to tokio-quiche

quiche is Cloudflare’s open source QUIC and HTTP/3 implementation written in Rust and designed as a low-level, sans-io library. It implements the QUIC transport state machine, including connection establishment, flow control and stream multiplexing, while making no assumptions about how applications perform IO. To use quiche directly, integrators must open UDP sockets, send and receive datagrams, manage timers and feed all packet data into quiche in the correct order. This design gives flexibility, but it makes integration error-prone and time-consuming.

tokio-quiche packages this integration work into a reusable crate. It combines the sans-io QUIC and HTTP/3 implementation from quiche with the Tokio async runtime, and exposes an API that already manages UDP sockets, packet routing and calls into the quiche state machine.

Actor based architecture on Tokio

Internally, tokio-quiche uses an actor model on top of Tokio. Actors are small tasks with local state that communicate through message passing over channels, which aligns well with sans-io protocol implementations that own internal state and operate on message-like buffers.

The primary actor is the IO loop actor, which moves packets between quiche and the UDP socket. One of the key message types is an Incoming struct that describes received UDP packets. Async integration follows a fixed pattern: the IO loop awaits new messages, translates them into inputs for quiche, advances the QUIC state machine, then translates outputs into outbound packets that are written back to the socket.

For each UDP socket, tokio-quiche spawns two important tasks. InboundPacketRouter owns the receiving half of the socket and routes inbound datagrams by destination connection ID to per-connection channels. IoWorker is the per-connection IO loop that drives a single quiche Connection, interleaving calls to quiche with calls to application-specific logic implemented through ApplicationOverQuic. This design encapsulates connection state inside each actor and keeps QUIC processing isolated from higher-level protocol code.
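
tokio-quiche itself is Rust, but the routing pattern is language-agnostic. The following Python asyncio sketch only mirrors the idea behind InboundPacketRouter and IoWorker; the fixed-length connection ID prefix, queue layout and worker loop are illustrative assumptions, not tokio-quiche’s API.

import asyncio

async def io_worker(conn_id: bytes, inbound: asyncio.Queue):
    # Per-connection IO loop: await packets, advance the connection state,
    # emit outbound packets. A None sentinel shuts the worker down.
    while True:
        packet = await inbound.get()
        if packet is None:
            break
        # In tokio-quiche this is where the quiche state machine is advanced
        # and the application logic behind ApplicationOverQuic is invoked.
        print(f"conn {conn_id.hex()}: processed {len(packet)} bytes")

class InboundRouter:
    # Routes datagrams to per-connection queues by connection ID (illustrative).
    def __init__(self):
        self.connections = {}  # conn_id -> asyncio.Queue

    def route(self, datagram: bytes):
        conn_id = datagram[:8]  # assumption: fixed-length connection ID prefix
        if conn_id not in self.connections:
            queue = asyncio.Queue()
            self.connections[conn_id] = queue
            asyncio.create_task(io_worker(conn_id, queue))
        self.connections[conn_id].put_nowait(datagram)

async def main():
    router = InboundRouter()
    router.route(b"AAAAAAAA" + b"hello")
    router.route(b"BBBBBBBB" + b"world")
    for q in router.connections.values():
        q.put_nowait(None)  # shut the workers down cleanly
    await asyncio.sleep(0.1)

asyncio.run(main())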

ApplicationOverQuic and H3Driver

QUIC is a transport protocol and can carry multiple application protocols; HTTP/3, DNS over QUIC and Media over QUIC are examples covered by IETF specifications. To avoid coupling tokio-quiche to a single protocol, the Cloudflare team exposes an ApplicationOverQuic trait. The trait abstracts over quiche methods and the underlying IO, and presents higher-level events and hooks to the application that implements the protocol. For example, the HTTP/3 debug and test client h3i uses its own non-HTTP/3 implementation of ApplicationOverQuic.

On top of this trait, tokio-quiche ships a dedicated HTTP/3-focused implementation named H3Driver. H3Driver connects quiche’s HTTP/3 module to the IO loop actor and converts raw HTTP/3 events into higher-level events with asynchronous body streams that are convenient for application code. H3Driver is generic and exposes ServerH3Driver and ClientH3Driver variants that add server-side and client-side behavior on top of the core driver. These components provide the building blocks for HTTP/3 servers and clients that share implementation patterns with Cloudflare’s internal infrastructure.

Production usage and roadmap

tokio-quiche was used inside Cloudflare for several years before its public release. It powers Proxy B in Apple iCloud Private Relay, Oxy-based HTTP/3 servers and the WARP MASQUE client, as well as the async version of h3i. In the WARP client, MASQUE tunnels built on tokio-quiche replace earlier WireGuard-based tunnels with QUIC-based ones. These systems run at Cloudflare edge scale and demonstrate that the integration can sustain millions of HTTP/3 requests per second in production.

Cloudflare positions tokio-quiche as a foundation rather than a complete HTTP/3 framework. The library exposes low level protocol capabilities and example client and server event loops, and leaves room for higher level projects to implement opinionated HTTP servers, DNS over QUIC clients, MASQUE based VPNs and other QUIC applications on top. By releasing the crate, Cloudflare aims to lower the barrier for Rust teams to adopt QUIC, HTTP/3 and MASQUE, and to align external integrations with the same transport stack used in its edge services.

Key Takeaways

tokio-quiche = quiche + Tokio: tokio-quiche is an async Rust library that integrates Cloudflare’s sans-io QUIC and HTTP/3 implementation, quiche, with the Tokio runtime, so developers do not need to hand-write UDP and event loop plumbing.

Actor based architecture for QUIC connections: The library uses an actor model on Tokio, with an InboundPacketRouter that routes UDP datagrams by connection ID and an IoWorker that drives a single quiche Connection per task, keeping transport state isolated and composable.

ApplicationOverQuic abstraction: Protocol logic is separated through the ApplicationOverQuic trait, which abstracts over quiche and I/O details so different QUIC-based protocols such as HTTP/3, DNS over QUIC or custom protocols can be implemented on top of the same transport core.

HTTP/3 via H3Driver, ServerH3Driver and ClientH3Driver: tokio-quiche ships H3Driver plus ServerH3Driver and ClientH3Driver variants that bridge quiche’s HTTP/3 module to async Rust code, exposing HTTP/3 streams and bodies in a way that fits typical Tokio based services.


How to Design Transactional Agentic AI Systems with LangGraph Using Two-Phase Commit, Human Interrupts, and Safe Rollbacks

In this tutorial, we implement an agentic AI pattern using LangGraph that treats reasoning and action as a transactional workflow rather than a single-shot decision. We model a two-phase commit system in which an agent stages reversible changes, validates strict invariants, pauses for human approval via graph interrupts, and only then commits or rolls back. With this, we demonstrate how agentic systems can be designed with safety, auditability, and controllability at their core, moving beyond reactive chat agents toward structured, governance-aware AI workflows that run reliably in Google Colab using OpenAI models.

!pip -q install -U langgraph langchain-openai

import os, json, uuid, copy, math, re, operator
from typing import Any, Dict, List, Optional
from typing_extensions import TypedDict, Annotated

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, AnyMessage
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import interrupt, Command

def _set_env_openai():
    if os.environ.get("OPENAI_API_KEY"):
        return
    try:
        from google.colab import userdata
        k = userdata.get("OPENAI_API_KEY")
        if k:
            os.environ["OPENAI_API_KEY"] = k
            return
    except Exception:
        pass
    import getpass
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter OPENAI_API_KEY: ")

_set_env_openai()

MODEL = os.environ.get("OPENAI_MODEL", "gpt-4o-mini")
llm = ChatOpenAI(model=MODEL, temperature=0)

We set up the execution environment by installing LangGraph and initializing the OpenAI model. We securely load the API key and configure a deterministic LLM, ensuring that all downstream agent behavior remains reproducible and controlled.

SAMPLE_LEDGER = [
    {"txn_id": "T001", "name": "Asha", "email": "ASHA@Example.com", "amount": "1,250.50", "date": "12/01/2025", "note": "Membership renewal"},
    {"txn_id": "T002", "name": "Ravi", "email": "ravi@example.com", "amount": "-500", "date": "2025-12-02", "note": "Chargeback?"},
    {"txn_id": "T003", "name": "Sara", "email": "sara@example.com", "amount": "700", "date": "02-12-2025", "note": "Late fee waived"},
    {"txn_id": "T003", "name": "Sara", "email": "sara@example.com", "amount": "700", "date": "02-12-2025", "note": "Duplicate row"},
    {"txn_id": "T004", "name": "Lee", "email": "lee@example.com", "amount": "NaN", "date": "2025/12/03", "note": "Bad amount"},
]

ALLOWED_OPS = {"replace", "remove", "add"}

def _parse_amount(x):
    """Parse an amount into a float, tolerating thousands separators; None if invalid."""
    if isinstance(x, (int, float)):
        return float(x)
    if isinstance(x, str):
        try:
            v = float(x.replace(",", ""))
            return None if math.isnan(v) else v  # treat "NaN" as a bad amount
        except ValueError:
            return None
    return None

def _iso_date(d):
    """Normalize a date string to ISO YYYY-MM-DD; None if the format is unrecognized."""
    if not isinstance(d, str):
        return None
    d = d.replace("/", "-")
    p = d.split("-")
    if len(p) == 3 and len(p[0]) == 4:
        return d
    if len(p) == 3 and len(p[2]) == 4:
        return f"{p[2]}-{p[1]}-{p[0]}"
    return None

def profile_ledger(rows):
    """Flag rows with unparseable amounts or duplicate transaction IDs."""
    seen, anomalies = {}, []
    for i, r in enumerate(rows):
        if _parse_amount(r.get("amount")) is None:
            anomalies.append(i)
        if r.get("txn_id") in seen:
            anomalies.append(i)
        seen[r.get("txn_id")] = i
    return {"rows": len(rows), "anomalies": anomalies}

def apply_patch(rows, patch):
    """Apply a patch to a deep copy of rows. Removals run first, highest index first,
    so add/replace indices refer to the post-removal list."""
    out = copy.deepcopy(rows)
    for op in sorted([p for p in patch if p["op"] == "remove"], key=lambda x: x["idx"], reverse=True):
        out.pop(op["idx"])
    for op in patch:
        if op["op"] in {"add", "replace"}:
            out[op["idx"]][op["field"]] = op["value"]
    return out

def validate(rows):
    """Strict invariants: every amount must parse and every date must normalize."""
    issues = []
    for i, r in enumerate(rows):
        if _parse_amount(r.get("amount")) is None:
            issues.append(i)
        if _iso_date(r.get("date")) is None:
            issues.append(i)
    return {"ok": len(issues) == 0, "issues": issues}

We define the core ledger abstraction along with the patching, normalization, and validation logic. We treat data transformations as reversible operations, allowing the agent to reason about changes safely before committing them.

class TxnState(TypedDict):
    messages: Annotated[List[AnyMessage], add_messages]
    raw_rows: List[Dict[str, Any]]
    sandbox_rows: List[Dict[str, Any]]
    patch: List[Dict[str, Any]]
    validation: Dict[str, Any]
    approved: Optional[bool]

def node_profile(state):
    p = profile_ledger(state["raw_rows"])
    return {"messages": [AIMessage(content=json.dumps(p))]}

def node_patch(state):
    sys = SystemMessage(content="Return a JSON patch list fixing amounts, dates, emails, duplicates")
    usr = HumanMessage(content=json.dumps(state["raw_rows"]))
    r = llm.invoke([sys, usr])
    # Extract the first JSON array from the model output; the brackets must be escaped.
    patch = json.loads(re.search(r"\[.*\]", r.content, re.S).group())
    return {"patch": patch, "messages": [AIMessage(content=json.dumps(patch))]}

def node_apply(state):
    return {"sandbox_rows": apply_patch(state["raw_rows"], state["patch"])}

def node_validate(state):
    v = validate(state["sandbox_rows"])
    return {"validation": v, "messages": [AIMessage(content=json.dumps(v))]}

def node_approve(state):
    # interrupt() pauses the graph and returns whatever value is supplied on resume.
    decision = interrupt({"validation": state["validation"]})
    return {"approved": decision == "approve"}

def node_commit(state):
    return {"messages": [AIMessage(content="COMMITTED")]}

def node_rollback(state):
    return {"messages": [AIMessage(content="ROLLED BACK")]}

We model the agent’s internal state and define each node in the LangGraph workflow. We express agent behavior as discrete, inspectable steps that transform state while preserving message history.

builder = StateGraph(TxnState)

builder.add_node("profile", node_profile)
builder.add_node("patch", node_patch)
builder.add_node("apply", node_apply)
builder.add_node("validate", node_validate)
builder.add_node("approve", node_approve)
builder.add_node("commit", node_commit)
builder.add_node("rollback", node_rollback)

builder.add_edge(START, "profile")
builder.add_edge("profile", "patch")
builder.add_edge("patch", "apply")
builder.add_edge("apply", "validate")

builder.add_conditional_edges(
    "validate",
    lambda s: "approve" if s["validation"]["ok"] else "rollback",
    {"approve": "approve", "rollback": "rollback"}
)

builder.add_conditional_edges(
    "approve",
    lambda s: "commit" if s["approved"] else "rollback",
    {"commit": "commit", "rollback": "rollback"}
)

builder.add_edge("commit", END)
builder.add_edge("rollback", END)

app = builder.compile(checkpointer=InMemorySaver())

We construct the LangGraph state machine and explicitly encode the control flow between profiling, patching, validation, approval, and finalization. We use conditional edges to enforce governance rules rather than rely on implicit model decisions.

def run():
    state = {
        "messages": [],
        "raw_rows": SAMPLE_LEDGER,
        "sandbox_rows": [],
        "patch": [],
        "validation": {},
        "approved": None,
    }

    cfg = {"configurable": {"thread_id": "txn-demo"}}
    out = app.invoke(state, config=cfg)

    if "__interrupt__" in out:
        # Interrupt objects are not JSON serializable, so print their payloads directly.
        for intr in out["__interrupt__"]:
            print(intr.value)
        decision = input("approve / reject: ").strip()
        out = app.invoke(Command(resume=decision), config=cfg)

    print(out["messages"][-1].content)

run()

We run the transactional agent and handle human-in-the-loop approval through graph interrupts. We resume execution deterministically, demonstrating how agentic workflows can pause, accept external input, and safely conclude with either a commit or rollback.

In conclusion, we showed how LangGraph enables us to build agents that reason over states, enforce validation gates, and collaborate with humans at precisely defined control points. We treated the agent not as an oracle, but as a transaction coordinator that can stage, inspect, and reverse its own actions while maintaining a full audit trail. This approach highlights how agentic AI can be applied to real-world systems that require trust, compliance, and recoverability, and it provides a practical foundation for building production-grade autonomous workflows that remain safe, transparent, and human-supervised.


Tencent Released Tencent HY-Motion 1.0: A Billion-Parameter Text-to-Motion Model Built on the Diffusion Transformer (DiT) Architecture and Flow Matching

Tencent Hunyuan’s 3D Digital Human team has released HY-Motion 1.0, an open-weight text-to-3D human motion generation family that scales Diffusion Transformer based Flow Matching to 1B parameters in the motion domain. The models turn natural language prompts plus an expected duration into 3D human motion clips on a unified SMPL-H skeleton, and are available on GitHub and Hugging Face with code, checkpoints and a Gradio interface for local use.

Paper: https://arxiv.org/pdf/2512.23464

What HY-Motion 1.0 provides for developers

HY-Motion 1.0 is a series of text-to-3D human motion generation models built on a Diffusion Transformer (DiT) trained with a Flow Matching objective. The series comprises two variants: HY-Motion-1.0, a standard model with 1.0B parameters, and HY-Motion-1.0-Lite, a lightweight option with 0.46B parameters.

Both models generate skeleton-based 3D character animations from simple text prompts. The output is a motion sequence on an SMPL-H skeleton that can be integrated into 3D animation or game pipelines, for example digital humans, cinematics and interactive characters. The release includes inference scripts, a batch-oriented CLI and a Gradio web app, and supports macOS, Windows and Linux.

Data engine and taxonomy

The training data comes from three sources: in-the-wild human motion videos, motion capture data, and 3D animation assets for game production. The research team starts from 12M high quality video clips from HunyuanVideo, runs shot boundary detection to split scenes and a human detector to keep clips with people, then applies the GVHMR algorithm to reconstruct SMPL-X motion tracks. Motion capture sessions and 3D animation libraries contribute about 500 hours of additional motion sequences.

All data is retargeted onto a unified SMPL-H skeleton through mesh fitting and retargeting tools. A multi-stage filter removes duplicate clips, abnormal poses, outliers in joint velocity, anomalous displacements, long static segments and artifacts such as foot sliding. Motions are then canonicalized, resampled to 30 fps and segmented into clips shorter than 12 seconds with a fixed world frame, Y axis up and the character facing the positive Z axis. The final corpus contains over 3,000 hours of motion, of which 400 hours are high quality 3D motion with verified captions.
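
As a rough illustration of the resampling and segmentation step, the following Python sketch downsamples a motion array to 30 fps and cuts it into clips of at most 12 seconds. The source frame rate, nearest-frame resampling and array shapes are assumptions for the example, not the paper’s pipeline code.

import numpy as np

def resample_and_segment(motion, src_fps, tgt_fps=30, max_sec=12):
    """motion: (num_frames, feature_dim) array sampled at src_fps."""
    # Nearest-frame resampling from src_fps to tgt_fps.
    n_src = motion.shape[0]
    n_tgt = int(n_src * tgt_fps / src_fps)
    idx = np.minimum((np.arange(n_tgt) * src_fps / tgt_fps).astype(int), n_src - 1)
    resampled = motion[idx]
    # Segment into clips of at most max_sec seconds (360 frames at 30 fps).
    max_frames = tgt_fps * max_sec
    return [resampled[i:i + max_frames] for i in range(0, len(resampled), max_frames)]

# Example: 20 seconds of 60 fps motion with a 201-dimensional frame representation.
clips = resample_and_segment(np.zeros((1200, 201)), src_fps=60)
print([c.shape for c in clips])  # [(360, 201), (240, 201)]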

On top of this, the research team defines a 3-level taxonomy. At the top level there are 6 classes: Locomotion; Sports and Athletics; Fitness and Outdoor Activities; Daily Activities; Social Interactions and Leisure; and Game Character Actions. These expand into more than 200 fine-grained motion categories at the leaves, which cover both simple atomic actions and concurrent or sequential motion combinations.

Motion representation and HY-Motion DiT

HY-Motion 1.0 uses the SMPL-H skeleton with 22 body joints and no hand joints. Each frame is a 201-dimensional vector that concatenates global root translation in 3D space, global body orientation in a continuous 6D rotation representation, 21 local joint rotations in 6D form and 22 local joint positions in 3D coordinates (3 + 6 + 21 × 6 + 22 × 3 = 201). Velocities and foot contact labels are removed because they slowed training and did not help final quality. This representation is compatible with animation workflows and close to the DART model representation.
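
The per-frame layout can be made explicit with a small helper. The packing order below is an illustrative assumption; only the component sizes come from the paper.

import numpy as np

NUM_JOINTS = 22  # SMPL-H body joints, hands excluded

def pack_frame(root_trans, root_orient_6d, joint_rots_6d, joint_pos):
    """Concatenate one frame: 3 + 6 + 21*6 + 22*3 = 201 features."""
    assert root_trans.shape == (3,)            # global root translation
    assert root_orient_6d.shape == (6,)        # global orientation, 6D rotation
    assert joint_rots_6d.shape == (21, 6)      # local rotations for non-root joints
    assert joint_pos.shape == (NUM_JOINTS, 3)  # local joint positions
    return np.concatenate([root_trans, root_orient_6d,
                           joint_rots_6d.ravel(), joint_pos.ravel()])

frame = pack_frame(np.zeros(3), np.zeros(6), np.zeros((21, 6)), np.zeros((22, 3)))
print(frame.shape)  # (201,)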

The core network is a hybrid HY-Motion DiT. It first applies dual-stream blocks that process motion latents and text tokens separately. In these blocks, each modality has its own QKV projections and MLP, and a joint attention module allows motion tokens to query semantic features from text tokens while keeping modality-specific structure. The network then switches to single-stream blocks that concatenate motion and text tokens into one sequence and process them with parallel spatial and channel attention modules to perform deeper multimodal fusion.

For text conditioning, the system uses a dual-encoder scheme. Qwen3-8B provides token-level embeddings, while a CLIP-L model provides global text features. A Bidirectional Token Refiner corrects the causal attention bias of the LLM for non-autoregressive generation. These signals feed the DiT through adaptive layer normalization conditioning. Attention is asymmetric: motion tokens can attend to all text tokens, but text tokens do not attend back to motion, which prevents noisy motion states from corrupting the language representation. Temporal attention inside the motion branch uses a narrow sliding window of 121 frames, which focuses capacity on local kinematics while keeping cost manageable for long clips. Full Rotary Position Embedding is applied after concatenating text and motion tokens to encode relative positions across the whole sequence.
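
To make the asymmetric joint attention concrete, here is a simplified PyTorch sketch. The single head, identity projections and dimensions are illustrative assumptions; the real blocks use per-modality QKV projections, MLPs and multi-head attention.

import torch
import torch.nn.functional as F

def asymmetric_joint_attention(motion, text, d=64):
    """motion: (M, d), text: (T, d). Motion queries attend over [motion; text];
    text queries attend over text only, so noisy motion never leaks into text."""
    M, T = motion.shape[0], text.shape[0]
    tokens = torch.cat([motion, text], dim=0)   # (M + T, d)
    q = k = v = tokens                          # identity projections for brevity
    scores = q @ k.T / d ** 0.5                 # (M + T, M + T)
    mask = torch.zeros(M + T, M + T, dtype=torch.bool)
    mask[M:, :M] = True                         # block text -> motion attention
    scores = scores.masked_fill(mask, float("-inf"))
    out = F.softmax(scores, dim=-1) @ v
    return out[:M], out[M:]                     # updated motion and text tokens

motion_out, text_out = asymmetric_joint_attention(torch.randn(360, 64), torch.randn(32, 64))
print(motion_out.shape, text_out.shape)  # torch.Size([360, 64]) torch.Size([32, 64])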

Flow Matching, prompt rewriting and training

HY-Motion 1.0 uses Flow Matching instead of standard denoising diffusion. The model learns a velocity field along a continuous path that interpolates between Gaussian noise and real motion data. During training, the objective is a mean squared error between predicted and ground truth velocities along this path. During inference, the learned ordinary differential equation is integrated from noise to a clean trajectory, which gives stable training for long sequences and fits the DiT architecture.
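
The objective can be sketched in a few lines. Under the common linear-interpolation (rectified flow) form of Flow Matching, a point on the path is x_t = (1 - t) · x0 + t · x1 with x0 noise and x1 data, and the target velocity along that path is x1 - x0. The exact path and parameterization HY-Motion uses may differ, so treat this as a generic illustration with a placeholder model.

import torch

def flow_matching_loss(model, x1):
    """x1: (batch, frames, 201) clean motion clips. Generic rectified-flow objective."""
    x0 = torch.randn_like(x1)           # Gaussian noise endpoint
    t = torch.rand(x1.shape[0], 1, 1)   # one time value per sample
    xt = (1 - t) * x0 + t * x1          # point on the linear interpolation path
    target_velocity = x1 - x0           # constant along this path
    pred_velocity = model(xt, t)        # the DiT predicts a velocity field
    return torch.mean((pred_velocity - target_velocity) ** 2)

# Placeholder "model" that ignores time, just to make the sketch runnable.
dummy_model = lambda xt, t: torch.zeros_like(xt)
loss = flow_matching_loss(dummy_model, torch.randn(4, 360, 201))
print(loss.item())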

A separate Duration Prediction and Prompt Rewrite module improves instruction following. It uses Qwen3-30B-A3B as the base model and is trained on synthetic user-style prompts generated from motion captions with a VLM and LLM pipeline, for example Gemini 2.5 Pro. This module predicts a suitable motion duration and rewrites informal prompts into normalized text that is easier for the DiT to follow. It is trained first with supervised fine-tuning and then refined with Group Relative Policy Optimization, using Qwen3-235B-A22B as a reward model that scores semantic consistency and duration plausibility.

Training follows a 3-stage curriculum. Stage 1 performs large-scale pretraining on the full 3,000-hour dataset to learn a broad motion prior and basic text-motion alignment. Stage 2 fine-tunes on the 400-hour high quality set to sharpen motion detail and improve semantic correctness with a smaller learning rate. Stage 3 applies reinforcement learning, first Direct Preference Optimization using 9,228 curated human preference pairs sampled from about 40,000 generated pairs, then Flow GRPO with a composite reward. The reward combines a semantic score from a Text-Motion Retrieval model and a physics score that penalizes artifacts like foot sliding and root drift, under a KL regularization term to stay close to the supervised model.
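
The Stage 3 composite reward can be written schematically as below. The weights, the KL estimate and the score functions are placeholders; the paper’s exact coefficients are not published here.

def composite_reward(semantic_score, physics_penalty, kl_to_sft,
                     w_sem=1.0, w_phys=1.0, beta=0.1):
    # semantic_score: Text-Motion Retrieval alignment (higher is better)
    # physics_penalty: foot sliding + root drift magnitude (lower is better)
    # kl_to_sft: KL divergence from the supervised fine-tuned policy
    return w_sem * semantic_score - w_phys * physics_penalty - beta * kl_to_sft

print(composite_reward(semantic_score=0.8, physics_penalty=0.2, kl_to_sft=0.5))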

Benchmarks, scaling behavior and limitations

For evaluation, the team builds a test set of over 2,000 prompts that span the 6 taxonomy categories and include simple, concurrent and sequential actions. Human raters score instruction following and motion quality on a scale from 1 to 5. HY-Motion 1.0 reaches an average instruction following score of 3.24 and an SSAE score of 78.6 percent. Baseline text-to-motion systems such as DART, LoM, GoToZero and MoMask achieve scores between 2.17 and 2.31 with SSAE between 42.7 percent and 58.0 percent. For motion quality, HY-Motion 1.0 reaches 3.43 on average versus 3.11 for the best baseline.

Scaling experiments study four DiT configurations: 0.05B, 0.46B and 1B parameter models, plus a 0.46B variant trained only on 400 hours. Instruction following improves steadily with model size, with the 1B model reaching an average of 3.34. Motion quality saturates around the 0.46B scale, where the 0.46B and 1B models reach similar averages between 3.26 and 3.34. Comparing the 0.46B model trained on 3,000 hours with the 0.46B model trained only on 400 hours shows that larger data volume is key for instruction alignment, while high quality curation mainly improves realism.

Key Takeaways

Billion scale DiT Flow Matching for motion: HY-Motion 1.0 is the first Diffusion Transformer based Flow Matching model scaled to the 1B parameter level specifically for text-to-3D human motion, targeting high fidelity instruction following across diverse actions.

Large scale, curated motion corpus: The model is pretrained on over 3,000 hours of reconstructed, mocap and animation motion data and fine-tuned on a 400-hour high quality subset, all retargeted to a unified SMPL-H skeleton and organized into more than 200 motion categories.

Hybrid DiT architecture with strong text conditioning: HY-Motion 1.0 uses a hybrid dual-stream and single-stream DiT with asymmetric attention, narrow-window temporal attention and dual text encoders, Qwen3-8B and CLIP-L, to fuse token-level and global semantics into motion trajectories.

RL-aligned prompt rewrite and training pipeline: A dedicated Qwen3-30B-based module predicts motion duration and rewrites user prompts, and the DiT is further aligned with Direct Preference Optimization and Flow GRPO using semantic and physics rewards, which improves realism and instruction following beyond supervised training.
