Artificial Intelligence

Agentic Context Engineering


Large language model (LLM) applications such as agents and domain-specific reasoning increasingly rely on context adaptation, i.e., modifying inputs with instructions, strategies, or evidence rather than updating weights. Prior approaches improve usability but often suffer from brevity bias, which drops domain insights in favor of concise summaries, and from context collapse, where iterative rewriting erodes details over time. Building on the adaptive memory introduced by Dynamic Cheatsheet, we introduce ACE (Agentic Context Engineering), a framework that treats contexts as evolving playbooks that accumulate, refine, and organize strategies through a modular process of generation, reflection, and curation. ACE prevents collapse with structured, incremental updates that preserve detailed knowledge and scale with long-context models. Across agent and domain-specific benchmarks, ACE optimizes contexts both offline (e.g., system prompts) and online (e.g., agent memory), consistently outperforming strong baselines: +10.6% on agents and +8.6% on finance, while significantly reducing adaptation latency and rollout cost. Notably, ACE can adapt effectively without labeled supervision, instead leveraging natural execution feedback. On the AppWorld leaderboard, ACE matches the top-ranked production-level agent on the overall average and surpasses it on the harder test-challenge split, despite using a smaller open-source model. These results show that comprehensive, evolving contexts enable scalable, efficient, and self-improving LLM systems with low overhead.
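The generation, reflection, and curation loop lends itself to a compact sketch. The snippet below is a minimal illustration of that idea, not the authors' implementation: `llm` stands in for any chat-completion call, the playbook is simply a list of bullet strings, and all prompts and helper names are assumptions made for this example.

```python
# Minimal sketch of an ACE-style adaptation step (illustrative only).
from typing import Callable, List

def ace_step(llm: Callable[[str], str], playbook: List[str], task: str) -> List[str]:
    """One adaptation step: generate with the playbook, reflect, then curate deltas."""
    context = "\n".join(f"- {bullet}" for bullet in playbook)

    # Generator: attempt the task using the current playbook as context.
    trajectory = llm(f"Playbook:\n{context}\n\nTask: {task}\nSolve it step by step.")

    # Reflector: distill reusable strategies or pitfalls from the attempt
    # (natural execution feedback, e.g. test results, could be appended here).
    lessons = llm(
        "Review the attempt below and list concrete, reusable strategies or "
        f"pitfalls, one per line:\n{trajectory}"
    )

    # Curator: merge lessons as incremental bullet additions rather than
    # rewriting the whole playbook, which is what avoids context collapse.
    new_bullets = [ln.lstrip("-• ").strip() for ln in lessons.splitlines() if ln.strip()]
    return playbook + [b for b in new_bullets if b not in playbook]
```

Run over many tasks, the playbook only ever grows or is edited in small increments, which is how the paper preserves detailed knowledge instead of collapsing it into a short summary.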

Shannon Autonomous Pentester

Shannon is Artificial Intelligence-based penetration testing software.

Key features:

  • Operates autonomously, with no manual intervention required
  • Pentest reports with reproducible exploits
  • Critical OWASP Vulnerability Coverage
  • Code-Aware Dynamic Testing
  • Powered by Integrated Security Tools
  • Parallel Processing for Faster Results


Source: #

Professional Software Developers Don't Vibe, They Control: AI Agent Use for Coding in 2025

The rise of AI agents is transforming how software can be built. The promise of agents is that developers might write code more quickly, delegate multiple tasks to different agents, and even build a full piece of software purely from natural language. In reality, what roles agents play in professional software development remains an open question. This paper investigates how experienced developers use agents in building software, including their motivations, strategies, task suitability, and sentiments. Through field observations (N=13) and qualitative surveys (N=99), we find that while experienced developers value agents as a productivity boost, they retain their agency over software design and implementation out of insistence on fundamental software quality attributes, employing strategies grounded in their expertise to control agent behavior. In addition, experienced developers feel positive overall about incorporating agents into software development, given their confidence in compensating for the agents’ limitations. Our results shed light on the value of software development best practices for effective agent use, suggest the kinds of tasks for which agents may be suitable, and point toward future opportunities for better agentic interfaces and usage guidelines.

Geospatial Segmentation

Open LLM

Several open LLM models:

  • Writing: Kimi K2 / K2 Thinking
  • Coding: MiniMax M2 / GLM 4.6
  • OCR: DeepSeek / Qwen 3 VL
  • General: DeepSeek V3.2
  • Image: Flux 2 Dev / Z-Image
  • Reasoning: DeepSeek V3.2 Speciale

Access methods (a minimal example of API access follows this list):

  • Writing: Kimi's official chat UI
  • Coding: Zed / KiloCode
  • OCR: official chat / via API
  • General: DeepSeek's official chat
  • Reasoning: official chat / via API
  • Image and image editing: Hugging Face, ModelScope
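For the "via API" entries, access typically goes through an OpenAI-compatible endpoint. Below is a minimal sketch assuming DeepSeek's OpenAI-compatible API; the endpoint, model name, placeholder key, and prompt are illustrative, so check the provider's documentation before use.

```python
# Illustrative "via API" access through an OpenAI-compatible client.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",               # key issued by the provider
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                # or "deepseek-reasoner" for reasoning tasks
    messages=[{"role": "user", "content": "Summarize this OCR output: ..."}],
)
print(response.choices[0].message.content)
```

The same client pattern works for most of the hosted models above; only the base URL, model name, and key change.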

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity


Stanford researchers built a new prompting technique!

By adding ~20 words to a prompt, it:

  • boosts an LLM's creativity by 1.6-2x
  • raises human-rated diversity by 25.7%
  • beats fine-tuned models without any retraining
  • restores 66.8% of the creativity LLMs lose to alignment

Post-training alignment methods, such as RLHF, are designed to make LLMs helpful and safe.

However, these methods unintentionally cause a significant drop in output diversity, a phenomenon known as mode collapse.
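The core of the technique is a short addition to the prompt that asks the model to verbalize several candidate responses with their probabilities, rather than emit only its single most likely answer. A minimal sketch of such a wrapper is below; the wording paraphrases the idea and is not copied from the paper, and the function name is made up for this example.

```python
# Hypothetical wrapper illustrating a verbalized-sampling style prompt.
def verbalized_sampling_prompt(task: str, k: int = 5) -> str:
    # The ~20 added words ask the model to expose a distribution of answers
    # instead of collapsing onto one high-probability response.
    return (
        f"{task}\n\n"
        f"Generate {k} different responses with their corresponding probabilities, "
        "sampled from the full distribution of plausible answers."
    )

print(verbalized_sampling_prompt("Write a short story opening about a lighthouse keeper."))
```

You then sample or pick among the verbalized candidates, which is where the reported diversity gains come from.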

Agents, robots, and us: Skill partnerships in the age of AI

AI is expanding the productivity frontier. Realizing its benefits requires new skills and rethinking how people work together with intelligent machines.

At a glance

  • Work in the future will be a partnership between people, agents, and robots—all powered by AI. Today’s technologies could theoretically automate more than half of current US work hours. This reflects how profoundly work may change, but it is not a forecast of job losses. Adoption will take time. As it unfolds, some roles will shrink, others grow or shift, while new ones emerge—with work increasingly centered on collaboration between humans and intelligent machines.
  • Most human skills will endure, though they will be applied differently. More than 70 percent of the skills sought by employers today are used in both automatable and non-automatable work. This overlap means most skills remain relevant, but how and where they are used will evolve.
  • Our new Skill Change Index shows which skills will be most and least exposed to automation in the next five years. Digital and information-processing skills could be most affected; those related to assisting and caring are likely to change the least.
  • Demand for AI fluency—the ability to use and manage AI tools—has grown sevenfold in two years, faster than for any other skill in US job postings. The surge is visible across industries and likely marks the beginning of much bigger changes ahead.
  • By 2030, about $2.9 trillion of economic value could be unlocked in the United States—if organizations prepare their people and redesign workflows, rather than individual tasks, around people, agents, and robots working together.

Image: Our Skill Change Index assesses how automation exposure varies across skills.