Web Automation with Claude, MATLAB, Chromium, and Playwright

Duncan Carlsmith, University of Wisconsin-Madison

Introduction

Recent agentic browsers (Chrome with the Claude extension, and Comet by Perplexity) are marvelous but limited. This post describes two things: first, a personal agentic browser system that outperforms commercial AI browsers for complex tasks; and second, how to turn AI-discovered web workflows into free, deterministic MATLAB scripts that run without AI.

My setup is a MacBook Pro with the Claude Desktop app, MATLAB R2025b, and the open-source Chromium browser. Relevant MCP servers include fetch, filesystem, MATLAB, and Playwright, with shell access via MATLAB or a shell MCP server. Rather than use my desktop Chrome application, which might expose personal information, I use an independent, dedicated Chromium with a persistent login and preauthentication for protected websites. Rather than screenshots, which quickly saturate a chat context and are expensive, I use the Playwright MCP server, which accesses the browser DOM and accessibility tree directly. Working through the DOM makes operating complex web page UIs far more reliable than clicking on screenshots.

The required toolchain is straightforward. You need Node.js, the JavaScript runtime that executes Playwright scripts outside a browser. Install it, then set up a working directory and install Playwright with its bundled Chromium:

# Install Node.js via Homebrew (macOS) or download from nodejs.org

brew install node

# Create a working directory and install Playwright

mkdir MATLABWithPlaywright && cd MATLABWithPlaywright

npm init -y

npm install playwright

# Download Playwright's bundled Chromium (required for Tier 1)

npx playwright install chromium

That is sufficient for the Tier 1 examples. For Tier 2 (authenticated automation), you also need Google Chrome or the open-source Chromium browser, launched with remote debugging enabled as described below. Playwright itself is an open-source browser automation library from Microsoft that can either launch its own bundled browser or connect to an existing one -- this dual capability is the foundation of the two-tier architecture. For the AI-agentic work described in the Canvas section, you need Claude Desktop with MCP servers configured for filesystem access, MATLAB, and Playwright. The INSTALL.md in the accompanying FEX submission covers all of this in detail.

AI Browser on Steroids: Building Canvas Quizzes

An agentic browser example just completed illustrates the power of this approach. I am adding a computational thread to a Canvas LMS course in modern physics based on relevant interactive Live Scripts I have posted to the MATLAB File Exchange. For each of about 40 such Live Scripts, I wanted to build a Canvas quiz containing an introduction followed by a few multiple-choice questions and a few file-upload questions based on the "Try this" interactive suggestions (typically slider parameter adjustments) and "Challenges" (typically to extend the code to achieve some goal). The Canvas interface for quiz building is quite complex, especially since I use a lot of LaTeX. The LMS renders LaTeX using MathJax with accessibility features, and only a certain flavor of encoding renders the math correctly both in the quiz editor and when the quiz is displayed to a student.

My first prompt was essentially "Find all of my FEX submissions and categorize those relevant to modern physics." The categories emerged as Relativity, Quantum Mechanics, Atomic Physics, and Astronomy and Astrophysics. After I had preauthenticated at MathWorks through a Shibboleth university license authentication system, the next prompt was "Download and unzip the first submission in the relativity category, read the PDF of the executed script or view it under examples at FEX, then create quiz questions and answers as described above." The final prompt was essentially "Create a new quiz in my Canvas course in the Computation category with a due date at the end of the semester. Include the image and introduction from the FEX splash page and a link to FEX in the quiz instructions. Add the MC quiz questions with 4 answers each to select from, and the file upload questions. Record what you learned in a SKILL file in my MATLAB/claude/SKILLS folder on my filesystem." Claude offered a few options, and we chose to write and upload the quiz HTML from scratch via the Canvas REST API. Done. Finally, "Repeat for the other FEX File submissions." Each took a couple of minutes. The hard part was figuring out exactly what I wanted to do.

Mind you, I had tried to build a Canvas quiz including LaTeX and failed miserably with both the Claude Chrome extension and Comet. The UI manipulations, especially to handle the LaTeX, were too complex, and often these agentic browsers would click in the wrong place, wind up on a different page, even in another tab, and potentially become destructive.

A key gotcha with LaTeX in Canvas: the equation rendering system uses double URL encoding for LaTeX expressions embedded as image tags pointing to the Canvas equation server. The LaTeX strings must use single backslashes -- double backslashes produce broken output. And Canvas Classic Quizzes and New Quizzes handle MathJax differently, so you need to know which flavor your institution uses.
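The double encoding can be illustrated with a short JavaScript sketch. The /equation_images/ path, the equation_image class, and the data-equation-content attribute are assumptions here, not details taken from this post; compare against markup copied from a working equation in your own Canvas instance before relying on them.

```javascript
// Sketch: build a Canvas-style equation <img> tag from a LaTeX string.
// ASSUMPTIONS: the /equation_images/ path, the "equation_image" class, and
// the data-equation-content attribute are illustrative, not confirmed here.

// Escape characters that are unsafe inside an HTML attribute value.
function escapeAttr(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Apply URL encoding twice, as described above.
function doubleEncode(latex) {
  return encodeURIComponent(encodeURIComponent(latex));
}

function canvasEquationImg(latex) {
  const src = '/equation_images/' + doubleEncode(latex);
  const attr = escapeAttr(latex);
  return `<img class="equation_image" src="${src}" ` +
         `alt="LaTeX: ${attr}" data-equation-content="${attr}">`;
}

// String.raw keeps the single backslash intact in the JavaScript source.
const latex = String.raw`\frac{1}{2}mv^2`;
console.log(doubleEncode(latex));
console.log(canvasEquationImg(latex));
```

The first pass turns the backslash into %5C, and the second pass turns that percent sign into %25, so the backslash appears as %255C in the final URL.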

From AI-Assisted to Programmatic: The Two-Tier Architecture

An agentic-AI process, like the quiz creation, can become expensive. There is a lot of context, both physics content-related and process-related, and the token load mounts up in a chat. Wouldn't it be great if, after having used the AI for what it is best at -- summarizing material, designing student exercises, and discovering a web-automation process -- one could repeat the web-related steps programmatically for free with MATLAB? Indeed it would, and one can.

In my setup, usually an AI uses MATLAB MCP to operate MATLAB as a tool to assist with, say, launching an application like Chromium or to preprocess an image. But MATLAB can also launch any browser and operate it via Playwright. (To my knowledge, MATLAB can use its own browser to view a URL but not to manipulate it.) So the following workflow emerges:

1) Use an AI, perhaps by recording the DOM steps in a manual (human) manipulation, to discover a web-automation process.

2) Use the AI to write and debug MATLAB code to perform the process repeatedly, automatically, for free.

I call this "temperature zero" automation -- the AI contributes entropy during workflow discovery, then the deterministic script is the ground state.

The architecture is a pipeline in which a MATLAB function (.m) performs the following steps:

1) Generate JavaScript/Playwright code.

2) Write the code to a temporary .js file.

3) Execute it: system('node script.js').

4) Parse the output (JSON file or console).

5) Return a structured result to MATLAB.
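The Node.js end of this pipeline can be sketched as follows. This is a minimal illustration, not the code in the FEX package: the function name runAndReport, the file name result.json, and the result fields are all hypothetical. The idea is that the script always leaves a JSON file behind, on failure as well as success, so the MATLAB side has something definite to parse.

```javascript
// Sketch of the Node.js half of the pipeline: do the work, then always
// leave a JSON result file behind for MATLAB to parse.
// ASSUMPTIONS: runAndReport, result.json, and the field names are hypothetical.
const fs = require('fs');
const path = require('path');

// Run an async task and record its outcome (success or failure) as JSON.
async function runAndReport(task, resultPath) {
  let result;
  try {
    const data = await task();
    result = { success: true, data };
  } catch (err) {
    // A failed run still produces a file, so the caller never polls forever.
    result = { success: false, error: String((err && err.message) || err) };
  }
  result.finishedAt = new Date().toISOString();
  // Write under a temporary name, then rename, so a polling reader
  // never sees a half-written file.
  const tmp = resultPath + '.tmp';
  fs.writeFileSync(tmp, JSON.stringify(result, null, 2));
  fs.renameSync(tmp, resultPath);
  return result;
}

// Example usage (the task would normally contain the Playwright steps):
// runAndReport(async () => ({ title: 'example' }),
//              path.join(__dirname, 'result.json'));
```

On the MATLAB side, the file could then be read with fileread and decoded with jsondecode.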

The .js files serve double duty: they are both the runtime artifacts that MATLAB generates and executes, AND readable documentation of the exact DOM interactions Playwright performs. Someone who wants to adapt this for their own workflow can read the .js file and see every getByRole, fill, press, and click in sequence.

Tier 1: Basic Web Automation Examples

I have demonstrated this concept with three basic examples, each consisting of a MATLAB function (.m) that dynamically generates and executes a Playwright script (.js). These use Playwright's bundled Chromium in headless mode -- no authentication required, no persistent sessions.

01_ExtractTableData

extractTableData.m takes a URL and scrapes a complex Wikipedia table (List of Nearest Stars) that MATLAB's built-in webread cannot handle because the table is rendered by JavaScript. The function generates extract_table.js, which launches Playwright's bundled Chromium headlessly, waits for the full DOM to render, walks through the table rows extracting cell text, and writes the result as JSON. Back in MATLAB, the JSON is parsed and cleaned (stripping HTML tags, citation brackets, and Unicode symbols) into a standard MATLAB table.

T = extractTableData(...

'https://en.wikipedia.org/wiki/List_of_nearest_stars_and_brown_dwarfs');

disp(T(1:5, {'Star_name', 'Distance_ly_', 'Stellar_class'}))

histogram(str2double(T.Distance_ly_), 20)

xlabel('Distance (ly)'); ylabel('Count'); title('Nearest Stars')
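The in-browser extraction step might look like the sketch below. It is not the generated extract_table.js itself: the selector table.wikitable tr and the helper names are assumptions, and rows whose cell count differs from the header (for example because of merged cells) are simply skipped here rather than repaired.

```javascript
// Sketch of the extraction step. `page` is a Playwright Page that has
// already navigated to the target URL.
// ASSUMPTIONS: the selector and helper names are illustrative only.

// Collect the visible text of every cell, row by row.
async function extractRows(page, rowSelector = 'table.wikitable tr') {
  // The callback runs inside the browser, so it may only use DOM APIs.
  return page.$$eval(rowSelector, (rows) =>
    rows.map((row) =>
      Array.from(row.querySelectorAll('th, td')).map((cell) =>
        cell.innerText.trim()
      )
    )
  );
}

// Turn [header, ...body] into records keyed by the header text.
function rowsToRecords(rows) {
  if (rows.length === 0) return [];
  const [header, ...body] = rows;
  return body
    .filter((cells) => cells.length === header.length) // skip ragged rows
    .map((cells) =>
      Object.fromEntries(header.map((name, i) => [name, cells[i]]))
    );
}

// Example usage inside an async function:
//   const { chromium } = require('playwright');
//   const browser = await chromium.launch();
//   const page = await browser.newPage();
//   await page.goto(url, { waitUntil: 'load' });
//   const records = rowsToRecords(await extractRows(page));
//   require('fs').writeFileSync('table.json', JSON.stringify(records));
//   await browser.close();
```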

02_ScreenshotWebpage

screenshotWebpage.m captures screenshots at configurable viewport dimensions (desktop, tablet, mobile) with full-page or viewport-only options. The physics-relevant example captures the NASA Webb Telescope page at multiple viewport sizes. This is genuinely useful for checking how your own FEX submission pages or course sites look on different devices.
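A minimal sketch of the multi-viewport capture, assuming a Playwright Page that has already loaded the target URL. The preset names and pixel sizes are illustrative and are not the values used by screenshotWebpage.m.

```javascript
// Sketch: capture one page at several viewport sizes.
// ASSUMPTIONS: preset names and pixel sizes are illustrative only.
const VIEWPORTS = {
  desktop: { width: 1920, height: 1080 },
  tablet: { width: 768, height: 1024 },
  mobile: { width: 375, height: 667 },
};

// `page` is a Playwright Page that has already navigated to the target URL.
async function captureAll(page, prefix, { fullPage = false } = {}) {
  const files = [];
  for (const [name, size] of Object.entries(VIEWPORTS)) {
    await page.setViewportSize(size); // resize; the page re-lays itself out
    const file = `${prefix}_${name}.png`;
    await page.screenshot({ path: file, fullPage }); // fullPage scrolls the whole document
    files.push(file);
  }
  return files;
}

// Example usage inside an async function:
//   const files = await captureAll(page, 'webb', { fullPage: true });
```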

03_DownloadFile

downloadFile.m is the most complex Tier 1 function because it handles two fundamentally different download mechanisms. Direct-link downloads (where navigating to the URL triggers the download immediately) throw a "Download is starting" error that is actually success:

try {

await page.goto(url, { waitUntil: 'commit' });

} catch (e) {

// Ignore "Download is starting" -- that means it WORKED!

if (!e.message.includes('Download is starting')) throw e;

}

Button-click downloads (like File Exchange) require finding and clicking a download button after page load. The critical gotcha: the download event listener must be set up BEFORE navigation, not after. Getting this ordering wrong was one of those roadblocks that cost real debugging time.
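The ordering can be sketched as a small helper. It assumes a Playwright Page and a caller-supplied trigger function; the helper name downloadTo and the 'Download' button label in the usage comment are hypothetical.

```javascript
// Sketch: register the download listener BEFORE triggering the download.
// `page` is a Playwright Page; `trigger` is whatever starts the download
// (a goto for direct links, a button click for pages like File Exchange).
async function downloadTo(page, trigger, savePath) {
  // 1) Start listening first. Do not await yet, or the trigger never runs.
  const downloadPromise = page.waitForEvent('download');
  // 2) Now trigger the download.
  try {
    await trigger();
  } catch (e) {
    // A direct-link goto rejects with "Download is starting"; that is success.
    if (!String(e.message).includes('Download is starting')) {
      downloadPromise.catch(() => {}); // avoid a later unhandled rejection
      throw e;
    }
  }
  // 3) Only now wait for the event and save the file.
  const download = await downloadPromise;
  await download.saveAs(savePath);
  return download.suggestedFilename();
}

// Example usage inside an async function (button name is hypothetical):
//   await downloadTo(page, () => page.goto(url, { waitUntil: 'commit' }), 'data.zip');
//   await downloadTo(page,
//     () => page.getByRole('button', { name: 'Download' }).click(), 'pkg.zip');
```

Passing the trigger as a function lets the same helper cover both download mechanisms while keeping the listener registration in one place.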

The function also supports a WaitForLogin option that pauses automation for 45 seconds to allow manual authentication -- a bridge to Tier 2's persistent-session approach.

Another lesson learned: don't use Playwright for direct CSV or JSON URLs. MATLAB's built-in websave is simpler and faster for those. Reserve Playwright for files that require JavaScript rendering, button clicks, or authentication.

Tier 2: Production Automation with Persistent Sessions

Tier 2 represents the key innovation -- the transition from "AI does the work" to "AI writes the code, MATLAB does the work." The critical architectural difference from Tier 1 is a single line of JavaScript:

// Tier 1: Fresh anonymous browser

const browser = await chromium.launch();

// Tier 2: Connect to YOUR running, authenticated Chrome

const browser = await chromium.connectOverCDP('http://localhost:9222');

CDP is the Chrome DevTools Protocol -- the same WebSocket-based interface that Chrome's built-in developer tools use internally. When you launch Chrome with a debugging port open, any external program can connect over CDP to navigate pages, inspect and manipulate the DOM, execute JavaScript, and intercept network traffic. The reason this matters is that Playwright connects to your already-running, already-authenticated Chrome session rather than launching a fresh anonymous browser. Your cookies, login sessions, and saved credentials are all available. You launch Chrome once with remote debugging enabled:

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \

--remote-debugging-port=9222 \

--user-data-dir="$HOME/chrome-automation-profile"

Log into whatever sites you need. Those sessions persist across automation runs.
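A sketch of how one script might switch between the two tiers. The helper name openPage and its options are illustrative. One detail worth noting: after connectOverCDP, the logged-in profile lives in the browser's existing default context, so the sketch opens its page there rather than in a new context, which would start without those cookies.

```javascript
// Sketch: pick Tier 1 or Tier 2 at run time.
// `chromium` is the object exported by require('playwright').
// ASSUMPTION: the helper name and options are illustrative only.
async function openPage(chromium, { cdpUrl } = {}) {
  if (!cdpUrl) {
    // Tier 1: fresh, anonymous, headless browser.
    const browser = await chromium.launch();
    const page = await browser.newPage();
    return { browser, page, attached: false };
  }
  // Tier 2: attach to the Chrome that is already running and logged in.
  const browser = await chromium.connectOverCDP(cdpUrl);
  // The existing profile (cookies, logins) lives in the first context;
  // a brand-new context would start without them.
  const context = browser.contexts()[0] || (await browser.newContext());
  const page = await context.newPage();
  return { browser, page, attached: true };
}

// Example usage inside an async function:
//   const { chromium } = require('playwright');
//   const { page, attached } = await openPage(chromium,
//     { cdpUrl: 'http://localhost:9222' });
//   ...
//   await page.close(); // when attached, close only the tab this script opened
```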

addFEXTagLive.m

This is the workhorse function. It uses MATLAB's modern arguments block for input validation and does the following:

1) Verifies that the CDP connection to Chrome is alive with a curl check.

2) Dynamically generates a complete Playwright script with embedded conditional logic: check whether the tag already exists (skip if so); otherwise click "New Version", add the tag, increment the version number, add update notes, click Publish, confirm the license dialog, and verify the success message.

3) Executes the script asynchronously and polls for a result JSON file.

4) Returns a structured result with the action taken, version changes, and optional before/after screenshots.

result = addFEXTagLive( ...

'https://www.mathworks.com/matlabcentral/fileexchange/183228-...', ...

'interactive_examples', Screenshots=true);

% result.action is either 'skipped' or 'added_tag'

% result.oldVersion / result.newVersion show version bump

% result.screenshots.beforeImage / afterImage for display
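The CDP liveness check can also be done from Node.js rather than curl. The sketch below queries the /json/version endpoint that Chrome serves on its debugging port and returns the parsed payload, or null if nothing answers. The helper name checkCdp is hypothetical, and the use of 127.0.0.1 instead of localhost is a choice made here to sidestep IPv6 name resolution, not something the post prescribes.

```javascript
// Sketch: check that a Chrome debugging port is answering before automation.
// ASSUMPTIONS: the helper name, default host, and timeout are illustrative.
const http = require('http');

// Resolve to the parsed /json/version payload, or null if nothing answers.
function checkCdp(port = 9222, host = '127.0.0.1', timeoutMs = 2000) {
  return new Promise((resolve) => {
    const req = http.get(
      { host, port, path: '/json/version', timeout: timeoutMs },
      (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => {
          try { resolve(JSON.parse(body)); } catch { resolve(null); }
        });
      }
    );
    req.on('timeout', () => req.destroy()); // destroy() then emits 'error'
    req.on('error', () => resolve(null));   // refused, reset, or timed out
  });
}

// Example usage inside an async function:
//   const info = await checkCdp(9222);
//   if (!info) throw new Error('Chrome is not listening on port 9222');
//   console.log(info.Browser);
```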

The corresponding add_fex_tag_production.js is a standalone Node.js version that accepts command-line arguments:

node add_fex_tag_production.js 182704 interactive-script 0.01 "Added tag"

This is useful for readers who want to see the pure JavaScript logic without the MATLAB generation layer.

batch_tag_FEX_files.m

The batch controller reads a text file of URLs, loops through them calling addFEXTagLive with rate limiting (10 seconds between submissions), tracks success/skip/fail counts, and writes three output files: successful_submissions.txt, skipped_submissions.txt, and failed_submissions_to_retry.txt.
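The batch pattern itself is language-independent. As a rough Node.js analogue of the MATLAB controller (not a translation of it), the sketch below loops over URLs, pauses between items, and sorts outcomes into three lists. The worker callback and its return shape are hypothetical stand-ins for addFEXTagLive.

```javascript
// Sketch of the batch pattern in Node.js (the controller in this post is MATLAB).
// ASSUMPTION: `worker(url)` is hypothetical and resolves to
// { action: 'added_tag' } or { action: 'skipped' }.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runBatch(urls, worker, { delayMs = 10000 } = {}) {
  const out = { added: [], skipped: [], failed: [] };
  for (let i = 0; i < urls.length; i++) {
    const url = urls[i];
    try {
      const { action } = await worker(url);
      (action === 'skipped' ? out.skipped : out.added).push(url);
    } catch (err) {
      // One bad submission must not stop the remaining ones.
      out.failed.push({ url, error: String((err && err.message) || err) });
    }
    if (i < urls.length - 1) await sleep(delayMs); // rate limit between items
  }
  return out;
}

// Example usage inside an async function:
//   const summary = await runBatch(urls, tagOneSubmission, { delayMs: 10000 });
//   console.log(summary.added.length, summary.skipped.length, summary.failed.length);
```

Keeping the failed list separate is what makes a later retry pass cheap: only those URLs need to be fed back in.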

This script processed all 178 of my FEX submissions:

Total: 178 submissions processed in 2h 11m (~44 sec/submission)

Tags added: 146 (82%) | Already tagged: 32 (18%) | True failures: 0

Manual equivalent: ~7.5 hours | Token cost after initial engineering: $0

The Timeout Gotcha

An interesting gotcha emerged during the batch run. Nine submissions were reported as failures with timeout errors. The error message read:

page.textContent: Timeout 30000ms exceeded.

Call log: - waiting for locator('body')

Investigation revealed these were false negatives. The timeout occurred in the verification phase -- Playwright had successfully added the tag and clicked Publish, but the MathWorks server was slow to reload the confirmation page (>30 seconds). The tag was already saved. When a retry script ran, all nine immediately reported "Tag already exists -- SKIPPING." True success rate: 100%.

Could this have been fixed with a longer timeout or a different verification strategy? Sure. But I mention it because in a long batch process (2+ hours, 178 submissions), gotchas emerge intermittently that you never see in testing on five items. The verification-timeout pattern is a good one to watch for: your automation succeeded, but your success check failed.
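One possible "different verification strategy" is sketched below: treat a timeout in the verification step as "unknown" rather than "failed", then settle the question by checking the actual state. The callbacks verify and tagExists are hypothetical and would be supplied by the caller; the timeout detection relies on the error text shown above.

```javascript
// Sketch: treat a verification timeout as "unknown", then recheck the state.
// ASSUMPTION: `verify` and `tagExists` are hypothetical async callbacks.
function isTimeout(err) {
  return Boolean(err) &&
    (err.name === 'TimeoutError' ||
     /Timeout \d+ms exceeded/.test(String(err.message)));
}

async function confirmPublish(verify, tagExists) {
  try {
    await verify(); // e.g. wait for the success message
    return { ok: true, how: 'verified' };
  } catch (err) {
    if (!isTimeout(err)) throw err; // real errors still propagate
    // The page was slow, but the change may have been saved anyway.
    const saved = await tagExists(); // e.g. reload and look for the tag
    return { ok: saved, how: saved ? 'rechecked' : 'timeout' };
  }
}

// Example usage inside an async function:
//   const outcome = await confirmPublish(
//     () => page.getByText('successfully').waitFor(),  // hypothetical check
//     () => checkTagOnPage(page, tag));                // hypothetical helper
```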

Key Gotchas and Lessons Learned

A few more roadblocks worth flagging for anyone attempting this:

waitUntil options matter. Playwright's networkidle wait strategy almost never works on modern sites because analytics scripts keep firing. Use load or domcontentloaded instead. For direct downloads, use commit.

Quote escaping in MATLAB-generated JavaScript. When MATLAB's sprintf generates JavaScript containing CSS selectors with double quotes, things break. Using backticks as JavaScript template literal delimiters avoids the conflict.
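An alternative to the backtick approach, offered here only as a sketch and not as the method used in this package, is to keep page-specific strings out of the generated source entirely. The script reads them from a small JSON parameter file, which MATLAB could produce with jsonencode, so selectors arrive as data and need no escaping. The helper name loadParams and the selector field are hypothetical.

```javascript
// Sketch: read selectors and other strings from a JSON parameter file
// instead of interpolating them into generated JavaScript source.
// ASSUMPTIONS: loadParams and the `selector` field are illustrative only.
const fs = require('fs');

function loadParams(file) {
  const params = JSON.parse(fs.readFileSync(file, 'utf8'));
  if (typeof params.selector !== 'string') {
    throw new Error('params.selector must be a string');
  }
  return params;
}

// Example usage inside the script (path passed as the first argument):
//   const { selector } = loadParams(process.argv[2]);
//   await page.locator(selector).click();
```

Because JSON handles its own quoting, a selector containing double quotes, single quotes, or backticks survives the round trip unchanged.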

The FEX license confirmation popup is accessible to Playwright as a standard DOM dialog, not a browser popup. No special handling needed, but the Publish button appears twice -- once to initiate and once to confirm -- requiring exact: true in the role selector to distinguish them:

// First Publish (has a space/icon prefix)

await page.getByRole('button', { name: ' Publish' }).click();

// Confirm Publish (exact match)

await page.getByRole('button', { name: 'Publish', exact: true }).click();

File creation from Claude's container vs. your filesystem. This caused real confusion early on. Claude's default file creation tools write to a container that MATLAB cannot see. Files must be created using MATLAB's own file operations (fopen/fprintf/fclose) or the filesystem MCP's write_file tool to land on your actual disk.

Selector strategy. Prefer getByRole (accessibility-based, most stable) over CSS selectors or XPath. The accessibility tree is what Playwright MCP uses natively, and role-based selectors survive minor UI changes that would break CSS paths.

Two Modes of Working

Looking back, the Canvas quiz creation and the FEX batch tagging represent two complementary modes of working with this architecture:

The Canvas work keeps AI in the loop because each quiz requires different physics content -- the AI reads the Live Script, understands the physics, designs questions, and crafts LaTeX. The web automation (posting to Canvas via its REST API) is incidental. This is AI-in-the-loop for content-dependent work.

The FEX tagging removes AI from the loop because the task is structurally identical across 178 submissions -- navigate, check, conditionally update, publish. The AI contributed once to discover and encode the workflow. This is AI-out-of-the-loop for repetitive structural work.

Both use the same underlying architecture: MATLAB + Playwright + Chromium + CDP. The difference is whether the AI is generating fresh content or executing a frozen script.

Reference Files and FEX Submission

All of the Tier 1 and Tier 2 MATLAB functions, JavaScript templates, example scripts, installation guide, and skill documentation described in this post are available as a File Exchange submission: Web Automation with Claude, MATLAB, Chromium, and Playwright. The package includes:

Tier 1 -- Basic Examples:

- extractTableData.m + extract_table.js -- Web table scraping

- screenshotWebpage.m + screenshot_script.js -- Webpage screenshots

- downloadFile.m -- File downloads (direct and button-click)

- Example usage scripts for each

Tier 2 -- Production Automation:

- addFEXTagLive.m -- Conditional FEX tag management

- batch_tag_FEX_files.m -- Batch processing controller

- add_fex_tag_production.js -- Standalone Node.js automation script

- test_cdp_connection.js -- CDP connection verification

Documentation and Skills:

- INSTALL.md -- Complete installation guide (Node.js, Playwright, Chromium, CDP)

- README.md -- Package overview and quick start

- SKILL.md -- Best practices, decision trees, and troubleshooting (developed iteratively through the work described here)

The SKILL.md file deserves particular mention. It captures the accumulated knowledge from building and debugging this system -- selector strategies, download handling patterns, wait strategies, error handling templates, and the critical distinction between when to use Playwright versus MATLAB's native websave. It was developed as a "memory" for the AI assistant across chat sessions, but it serves equally well as a human-readable reference.

Credits and Conclusion

This synthesis of existing tools was conceived by the author, but architected (if I may borrow this jargon) by Claude.ai. This article was conceived and architected by the author, but Claude filled in the details, most of which, as a carbon-based life form, I could never remember. The author has no financial interest in MathWorks or Anthropic.
