In short – yes. I ran a few case studies, and it works pretty well even with the current state of LLMs (I used GPT-4o for the agent).
Here is a short video
To try this feature yourself, check out the new repository of the Browser Use script. It’s a Python script, so some familiarity with Python helps. But even if you don’t know Python, you can use Cursor IDE in agent mode, which helps you install and run Python scripts. And I am sure that soon we will also have services that do the whole job for you.
Here is the code for the scripts that I used in the video:
from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

llm = ChatOpenAI(model="gpt-4o")


async def main():
    agent = Agent(
        task="""Go to site https://woocommerce-732526-3966075.cloudwaysapps.com/wp-admin and login with username ... and password ....
Do Next steps on site:
1. Go to Posts -> All Posts.
2. Find all posts with category 'Uncategorized'. Click on "Quick Edit" link in each post, deselect "Uncategorized" category, select "Reviews" category, then click on 'Update' button.
""",
        llm=llm,
    )
    result = await agent.run()
    print(result)


asyncio.run(main())
As you can see, it’s pretty simple: you just need to give text instructions.
The next script is a bit more advanced because I used two agents. The first agent saves data in a specific format and then passes it to the second agent. The second agent takes this format and creates a post, mapping the data to the proper sections.
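Because the task is only a string, it can also be assembled from a template, which keeps the login details out of the script file. Here is a minimal sketch of that idea. The `build_task` helper and the `WP_USERNAME` / `WP_PASSWORD` variable names are my own assumptions and not part of Browser Use, and the URL is a placeholder.

```python
import os


def build_task(admin_url: str, username: str, password: str, steps: list[str]) -> str:
    """Assemble the plain-text instructions that would be passed as Agent(task=...)."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Go to site {admin_url} and login with username {username} "
        f"and password {password}.\n"
        f"Do Next steps on site:\n{numbered}\n"
    )


if __name__ == "__main__":
    # WP_USERNAME / WP_PASSWORD are hypothetical names; if they are defined in
    # the .env file, load_dotenv() makes them available through os.environ.
    task = build_task(
        admin_url="https://example.com/wp-admin",  # placeholder URL
        username=os.environ.get("WP_USERNAME", "<username>"),
        password=os.environ.get("WP_PASSWORD", "<password>"),
        steps=[
            "Go to Posts -> All Posts.",
            "Find all posts with category 'Uncategorized'.",
        ],
    )
    print(task)
```

Keep in mind that the password still ends up in the prompt sent to the model; this only removes it from the source code.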
from langchain_openai import ChatOpenAI
from browser_use import Agent, Controller, ActionResult
import asyncio
from dotenv import load_dotenv
from pydantic import BaseModel
from typing import List

load_dotenv()

controller = Controller()
llm = ChatOpenAI(model="gpt-4o")


class TopComment(BaseModel):
    text: str
    author: str


class Post(BaseModel):
    title: str
    summary: str
    comments_summary: str
    top_comment: TopComment


class RedditResult(BaseModel):
    posts: List[Post]


@controller.registry.action('Done with task', param_model=RedditResult)
async def done(params: RedditResult):
    result = ActionResult(is_done=True, extracted_content=params.model_dump_json())
    return result


async def main():
    # First agent - Reddit Scraper
    reddit_agent = Agent(
        task="""Go to https://www.reddit.com/r/WPDrama/ and get 2 latest posts. Analyze and summarize each of them and analyze comments.
""",
        llm=llm,
        controller=controller
    )
    # Get Reddit content
    reddit_result = await reddit_agent.run()
    reddit_result = reddit_result.final_result()
    if reddit_result:
        reddit_result = RedditResult.model_validate_json(reddit_result)
    else:
        print("No Reddit result")
        return

    # Second agent - WordPress Poster
    wp_agent = Agent(
        task=f"""Login on site https://woocommerce-732526-3966075.cloudwaysapps.com/wp-admin with:
username: ...
password: ...
Do Next steps:
1. Go to Posts -> New post
2. Create a post.
Title should be: 'Latest news about WP Drama', content must traverse all {reddit_result} results and must be formatted using WordPress Gutenberg syntax. Example for first result:
<!-- wp:paragraph -->
<p><strong>News:</strong> {reddit_result.posts[0].title}</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><strong>Summary:</strong> {reddit_result.posts[0].summary}</p>
<!-- /wp:paragraph -->
<!-- wp:paragraph -->
<p><strong>Comments summary:</strong> {reddit_result.posts[0].comments_summary}</p>
<!-- /wp:paragraph -->
<!-- wp:quote -->
<blockquote class="wp-block-quote"><!-- wp:paragraph {{"className":"is-style-default"}} -->
<p class="is-style-default"><strong>Top comment:</strong> {reddit_result.posts[0].top_comment.text}</p>
<!-- /wp:paragraph --><cite>{reddit_result.posts[0].top_comment.author}</cite></blockquote>
<!-- /wp:quote -->
<!-- wp:separator -->
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<!-- /wp:separator -->
Make sure that content is in valid WordPress Gutenberg syntax. Do not use '\n' or other line breaks in content. Do the same for all results.
""",
        llm=llm,
    )
    # Post to WordPress
    wp_result = await wp_agent.run()
    print("Reddit Result:", reddit_result)
    print("WordPress Result:", wp_result)


asyncio.run(main())
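The second task shows the model one example block and asks it to repeat the pattern for every result. An alternative would be to do the mapping in plain Python and hand the agent finished markup. The sketch below is not what the script above does: it uses standard-library dataclasses as stand-ins for the Pydantic models, the helper names `paragraph`, `render_post` and `render_all` are hypothetical, and the quote block is simplified (no `is-style-default` class).

```python
from dataclasses import dataclass
from html import escape


@dataclass
class TopComment:
    text: str
    author: str


@dataclass
class Post:
    title: str
    summary: str
    comments_summary: str
    top_comment: TopComment


def paragraph(label: str, value: str) -> str:
    """One Gutenberg paragraph block with a bold label; the value is HTML-escaped."""
    return (
        "<!-- wp:paragraph -->"
        f"<p><strong>{label}:</strong> {escape(value)}</p>"
        "<!-- /wp:paragraph -->"
    )


def render_post(post: Post) -> str:
    """Map one post onto the block layout used in the task, without line breaks."""
    parts = [
        paragraph("News", post.title),
        paragraph("Summary", post.summary),
        paragraph("Comments summary", post.comments_summary),
        "<!-- wp:quote -->",
        '<blockquote class="wp-block-quote">',
        paragraph("Top comment", post.top_comment.text),
        f"<cite>{escape(post.top_comment.author)}</cite></blockquote>",
        "<!-- /wp:quote -->",
        "<!-- wp:separator -->",
        '<hr class="wp-block-separator has-alpha-channel-opacity"/>',
        "<!-- /wp:separator -->",
    ]
    return "".join(parts)


def render_all(posts: list[Post]) -> str:
    """Render every post in order, so no result is skipped."""
    return "".join(render_post(p) for p in posts)


if __name__ == "__main__":
    sample = Post("Title", "Summary text", "Comments text", TopComment("Nice", "someone"))
    print(render_all([sample]))
```

With this approach the second agent only has to paste one prepared string into the editor, which leaves less room for the model to drop or reformat a result.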
A few closing words. It’s interesting to watch how the AI agent tries to understand what is happening on the page and clicks the proper buttons. Sometimes the agent gets it right only on the second or third attempt, but usually its actions are correct. You can improve the instructions and specify which element the agent must click at each step. It’s also possible to build a swarm of agents in which each agent reviews the work done by the previous ones.
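The idea of one agent reviewing another's work can be sketched as a small retry loop. This is only an illustration under my own assumptions: `run_with_review` is a hypothetical helper, and the two `fake_*` coroutines stand in for real agents so the example runs without a browser or an API key.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_with_review(
    worker: Callable[[Optional[str]], Awaitable[str]],
    reviewer: Callable[[str], Awaitable[Optional[str]]],
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Run the worker, let the reviewer check the result, retry with its feedback.

    The reviewer returns None to accept, or a feedback string to request a redo.
    Returns the last result and the number of attempts used.
    """
    feedback: Optional[str] = None
    result = ""
    for attempt in range(1, max_attempts + 1):
        result = await worker(feedback)
        feedback = await reviewer(result)
        if feedback is None:
            return result, attempt
    return result, max_attempts


# Stand-ins for two agents. In a real script each one would build a task
# string from its argument and await its own Agent(...).run() call.
async def fake_worker(feedback: Optional[str]) -> str:
    return "category: Reviews" if feedback else "category: Uncategorized"


async def fake_reviewer(result: str) -> Optional[str]:
    return None if "Reviews" in result else "Post is still Uncategorized."


if __name__ == "__main__":
    print(asyncio.run(run_with_review(fake_worker, fake_reviewer)))
```

Capping the number of attempts matters here, since every retry of a real agent costs time and model calls.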
Well, 2025 will be interesting.