Day 7: Adventures in AI-Enhanced Blog Building

Date: 2024-09-12

Join Will as he navigates through a jungle of AI enhancements and debugging tales in his quest to upgrade the Blog Builder platform, peppered with humor and technical intrigue.

Morning Pregame and Introduction

introduction summary

Dave

Welcome to another electrifying day in Will's world, where caffeine is a more reliable constant than Pi, and balancing family and code is just another Tuesday (or any day, really). Dave here, your friendly neighborhood editor, guiding you through the ups and downs of Will's AI ambitions and occasional existential coding crises.

Today, Will is charged up (literally, thanks to coffee) and somewhat recuperated from his coding burnout. He's tackling the AI functionalities of his blog with a zeal that could light up a small village. And oh, between debugging sessions, he's squeezing in some quality family time because apparently, his enthusiasm for AI only slightly overshadows his fear of LeetCode.

What's on the tech menu, you ask? First off, Will plans to wrangle the wild beasts of Blog Builder's AI pipelines into something a bit more civilized. Following that, he's outfitting the blog with a new SQL wardrobe and some flashy UI additions. All this before a grand testing ceremony where he prays everything doesn't implode like a badly scripted drama.

With a focus level rivaling a hyperactive squirrel and an enthusiasm that's cautiously optimistic, Will's strategy today involves a lot of crossing fingers and possibly some arcane incantations to the coding gods. He's aiming to get things working before he even considers polishing—because let's be honest, nobody likes a shiny but dysfunctional gadget. Here's to hoping his plan doesn't crash faster than his enthusiasm for LeetCode problems.

Gather around, folks, this day is set to be a coding carnival with a sprinkle of family fun and the looming specter of algorithmic angst!

personal context

Will

Good morning Dave and readers! Feel better after yesterday's burnout day. I'm fully rested and caffeinated. Going to be a shorter day as my Mom is visiting me and my brother in Tuscon. Feeling good overall this morning.

daily goals

Will

This morning I want to focus on really finishing the AI functionality of the blog. I want to save the end of the day for more leetcode practice.

learning focus

Will

Nothing major today.

challenges

Will

Trying to focus on getting a good working version of a prompt pipeline before overoptimizing.

plan of action

Will
  1. Take some time to review the current status of the Blog Builder's AI pipeline.
  2. Try to get some working version implemented of the pipeline.
  3. Modify the Blog Builder to better support the new SQL schema, especially allowing for only "published" blogs to be read by NextJS. Also add UI buttons for AI Editing.
  4. Put everything together and test it out!
  5. Go back and optimize the prompt pipeline.

focus level

enthusiasm level

leetcode hatred level

Task 0

task reflection summary

Dave

Will's day was quite the digital journey, filled with quality of life enhancements and pesky Pydantic puzzles. He began by unleashing a nifty auto-save feature in his Blog Builder, cleverly utilizing a debounce function to minimize database calls — a smart move for a smart guy, saving his sanity one callback at a time. Code he possibly coded at 3 am? Maybe.

But Will didn't stop there. He decided to play matchmaker for the Blog Builder by introducing it to some AI capabilities. He neatly set up multiple Flask API routes to manage blog postings with an AI flair, including differing routes for editing introductions, tasks, and reflections. Talk about a triple-threat! The day involved a significant rework of his SQL schema and adaptation of Pydantic models to craftily handle AI-generated content, proving he's as much an architect as a coder.

The depth of technical details provided by Will was impressive. He openly shared the specifics of his code implementations and thought processes, including descriptions of how he tailored function calls and managed API responses. He had a bustling day fixing bugs (ah, the ever-present companions of coders) and battling with serialization issues, which required nimble adaptation of his data handling techniques. This exposed him to some frustrating yet enlightening challenges, ultimately leading to a successful integration of the AI edit functionality.

One might sympathize with Will encountering validation errors - a rite of passage for every engineer. Through debugging and alterations to Pydantic models, he was able to conquer this beast, proving his mettle yet again. The day ended with a victory in setting up blog publishing processes, a testament to Will's persistence and technical proficiency.

task goal

Will

Review the current state of the Blog Builder, start implementing AI pipeline.

task description

Will

I'm going to need to see what's most important to work on. In the past I re-did the schema of my SQL table and Pydantic models to support AI content fields. I need to see what further work I need to complete. I'd like to revisit some of the UI on the blog builder, as I'm going to have to find a way for me to actually start the "AI Editing Pipeline". Also going to need to think of some way to differentiate "in progress" blogs from "published blogs".

task planned approach

Will

I'm going to achieve my goals by following these steps:

  1. Review what I like and don't like UI wise from the Blog Builder tool.
  2. Redesign my SQL schema to better hold AI edited content.
  3. Decide exactly how I want Dave to add content.
  4. Rework pydantic models to hold new AI content.
  5. Work on building AI Editing pipelines
  6. Integrate Viewing of AI Edits into Blog Builder
  7. Take time to really refine AI Edit prompts
  8. Improve Blog Builder UI as needed
  9. Check the NextJS frontend is correctly displaying AI content.

task progress notes

Will

It's been a good 4-5 hours throughout my day and I'm now just getting to the point where I can take some time and update you on what I've been working on. First of all, I want to note some quality of life changes I've made to the Blog Builder. First and foremost, I've added a feature to automatically save blog progress. No longer will I have to manually click the save function. This was as simple as adding event listeners to all my input fields. They would call my save_blog javascript function, which in turn would call my Flask API route to save the blog to my Supabase database. Interestingly, I decided to add some functionality to reduce the amount of calls to my database. This used the debounce function, which limits the number of calls to a javascript function for a given interval. Here's the code here:

function debounce(func, wait, immediate) {
    var timeout;
    return function () {
        var context = this, args = arguments;
        var later = function () {
            timeout = null;
            if (!immediate) func.apply(context, args);
        };
        var callNow = immediate && !timeout;
        clearTimeout(timeout);
        timeout = setTimeout(later, wait);
        if (callNow) func.apply(context, args);
    };
}

function save_blog() {
    blogData = get_blog_data_from_html()
    saveBlog(blogData)
}


function saveBlog(blogData) {


    console.log(`Saving blog for date: ${blogData.date}`)
    // Send the JSON object to Flask via a POST request
    fetch('/save-blog', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
        },
        body: JSON.stringify(blogData),
    })
        .then(response => response.json())
        .then(data => {
            console.log('Success:', data);
        })
        .catch((error) => {
            console.error('Error:', error);
        });
}

Notice the two different versions of saveBlog. "save_blog" is intended to be called when you have to get the blogData from all the input fields. This version of the function is called by all event listeners, who don't necessarily have access to all the fields to save. The "saveBlog" function is intended to be called only by my backend, when I already have all of the updated blogData. This is most useful when I'm running the AI edit pipeline and returning large amounts of AI content. Now that I'm not saving blog data manually, I'm going to update the frontend to have two new buttons! One is called "edit blog", and the other is called "publish blog".

Edit blog starts the prompt chain for editing a blog with Dave. I need to take a second to explain what the entire prompt chain is. I'm breaking up the edit blog functionality into 3 separate sections, the introduction, all tasks for the day, and of course the reflection. Each of these separate sections of a daily blog have a separate Flask API route, separate python function, and of course separate prompts. Here are the three separate Flask routes:

@app.route('/edit_introduction', methods=['POST'])
def edit_introduction():
    data = request.get_json()
    intro_model = Introduction(**data)
    updated_intro = ai_edit_introduction(intro_model)


    return jsonify(updated_intro.dict())


@app.route('/edit_task', methods=['POST'])
def edit_task():
    data = request.get_json()
    
    tasks = []
    for object in data['tasks']:
        task = Task(**object)
        tasks.append(task)


    # Process each task using the AI edit function and collect results
    updated_tasks = ai_edit_task(tasks)


    # Convert list of updated Task models back to JSON
    return jsonify(updated_tasks)


@app.route('/edit_reflection', methods=['POST'])
def edit_reflection():
    data = request.get_json()
    reflection_model = DailyBlog(**data)
    updated_blog = ai_edit_reflection(reflection_model)
    return updated_blog

Great! Now that I've set up an API route, I just need to work on all of the "ai_edit" functions. I use a very similar format for all of my AI prompts and completions. I set up a function with some of my favorite utility functions. I have to define an API provider, as well as an llm model. In this case I'm using GPT-4-turbo and instructor/openai. Instructor is one of my favorite libraries to use, which allows me to do instant output validation and retries, receiving my completion responses as a Pydantic model. Here's the "ai_edit_introduction" function:

def ai_edit_introduction(introduction: Introduction):
    vendor = "instructor/openai"
    llm_model = "gpt-4-turbo"
    introduction_messages = introduction_summary_prompt(json.dumps(introduction.model_dump()))


    params = APIParameters(
        vendor=vendor,
        model=llm_model,
        messages=introduction_messages,
        temperature=1,
        response_model=IntroductionContent,
        max_retries=2,
        rag_tokens=0
    )
    completion_response = util.create_chat_completion(params, insert_usage=False)
    response = completion_response[0]
    introduction.introduction_summary = response.introduction_summary
    #introduction.introduction_summary = response
    
    return introduction

I think it will be good to show off the pydantic model which I defined to hold my response from util.create_chat_completion.

class RemarkForImprovement(BaseModel):
    location: str = Field(..., description="Identifies the location or context of the remark within the blog post.")
    comment: str = Field(..., description="Detailed comment suggesting how to improve or clarify the blog content. Make Will really describe technical challenges.")


class IntroductionContent(BaseModel):
    summary_plan: str = Field(..., description="A plan for how you can write a well formed and humorous summary of the introduction")
    introduction_summary: str = Field(..., description="A detailed summary of Will's original writing for the introduction with humor written from the perspective of Dave.")
    remarks_for_improvement: List[RemarkForImprovement] = Field(default_factory=list, description="Suggestions for enhancing the introduction's clarity or depth. What could improve Will's introduction content?")

Because Instructor is designed to provide the schema of a pydantic model directly into the prompt I send, I need to add incredibly descriptive field descriptions. If you look at my function, I only currently update the actual Introduction model witht he introduction_summary. Summary_plan is a place for the LLM to write instructions to itself. I find this kind of reflection/planning space is really good at improving the quality of LLM outputs. remarks_for_improvement are remarks by Dave, indicating where I can use more content and better writing. In the future I'd like to use this more heavily, so that Dave can provide me with real time suggestions to improve my writing. For now, I will keep that in my back pocket. The actual content that I want Dave to write is pretty simple and limited for the Introduction. I am now going to show a much more complicated example:

class TaskContent(BaseModel):
    summary_plan: str = Field(..., description="A plan for how you can write a well formed and humorous summary of the task.")
    task_start_summary: str = Field(..., description="A detailed summary of Will's original writing for the task start with humor written from the perspective of Dave.")
    task_reflection_summary: Optional[str] = Field(default=None, description="AI summary of how the task went.")
    output_or_result: Optional[str] = Field("", description="The outcome or deliverable from this task (e.g., code, documentation).")
    challenges_encountered: Optional[str] = Field("", description="Key challenges or bugs encountered.")
    follow_up_tasks: Optional[str] = Field("", description="Immediate next steps or follow-up tasks.")
    reflection_successes: Optional[str] = Field("", description="What worked well for Will during the task?")
    reflection_failures: Optional[str] = Field("", description="What didn't work for Will during the task, and why?")
    research_questions: Optional[str] = Field("", description="An always updated list of research questions Will had while working on the task")
    tools_used: Optional[str] = Field("", description="Key tools, libraries, or frameworks used during the task.")
    remarks_for_improvement: List[RemarkForImprovement] = Field(default_factory=list, description="Suggestions for enhancing the tasks's clarity or depth. What could add to the technical depth of this content?") 

The task of each blog requires a lot more fields for Dave to fill out. He has to create two summaries, one that is for my initial planning fields, and one for the entire task itself. I make him fill out some additional information as well, such as reflection_successes and reflection_failures. The goal is to make my life as easy as possible. I will write the majority of my "blog" inside the "task_progress_notes" field, the one I'm in right now. Dave's job is to augment this with summaries, as well as filling out other crucial analysis fields. More in depth testing has to be done on the "quality" of his responses, but that's partially why I built the ability to "AI Edit" directly into the local blog builder. When I'm finished writing in my daily progress for each task, and have filled out some initial daily reflections, I can hit 'AI Edit" button and have Dave write up and analyse my work. If I don't like it, I can regenerate. Later on I'd like to incorporate the "remarks_for_improvement" more directly into the blog builder. Maybe they can be in an area to the side of each field that needs improvement. Right now the functionality of AI editing is very much all or nothing. I have Dave look at an entire blog, and add his own comments or analysis. Mostly this is for a fear of over complication. My fields are already numerous and complicated. Also, a very important part of each field is context. Most of the reflection fields REQUIRE context from the introduction and daily blogs. Most of the tasks need context from previous fields, such as "planned" approach. I think there can be a way to have Dave help me AS I WRITE, but I wanted to develop at least a "complete" editing functionality first. The whole AI edit pipeline took me around 3 hours to complete. Mostly this was from testing out functionality, and really looking at Dave's responses. I spent a good amount of time creating Pydantic models for all of my responses.

One of the most annoying bugs/challenges I ran into is Pydantic serialization errors. I have to feed in data to my frontend in the format of JSON. Normally, Pydantic plays really nice with JSON with the built in model_dump() and model_dump_json() functions. However, this can get more complicated if you use nested submodels. In one case I wanted to return all of the edited tasks, as a list of pydantic model tasks: List[Task]. The problem is that I can't just jsonify this output, and get a serialization error. I can't call .model_dump() on a List object, so I modify my logic to call model_dump for each task, and then append to a list. That way, I am always guranteed to be able to jsonify a list of dicts, instead of a list of pydantic models:

def ai_edit_task(tasks: List[Task]):
    vendor = "instructor/openai"
    llm_model = "gpt-4-turbo"


    updated_tasks = []


    for task in tasks:
        task_dict = task.model_dump(exclude=["task_start_summary", "task_reflection_summary", "output_or_result", "challenges_encountered", "follow_up_tasks", "reflection_successes", "reflection_failures", "research_questions", "tools_used"])
        
        task_messages = task_summary_prompt(task_dict)


        params = APIParameters(
            vendor=vendor,
            model=llm_model,
            messages=task_messages,
            temperature=1,
            response_model=TaskContent,
            max_retries=2,
            rag_tokens=0
        )
        completion_response = util.create_chat_completion(params, insert_usage=False)
        response = completion_response[0]
        print(response)
        task.task_start_summary = response.task_start_summary
        task.task_reflection_summary = response.task_reflection_summary
        task.output_or_result = response.output_or_result
        task.challenges_encountered = response.challenges_encountered
        task.follow_up_tasks = response.follow_up_tasks
        task.reflection_successes = response.reflection_successes
        task.reflection_failures = response.reflection_failures
        task.research_questions = response.research_questions
        task.tools_used = response.tools_used
        updated_tasks.append(task.model_dump())
    
    return updated_tasks

The AI editing of the reflection is MOST difficult because of the amount of context I have to provide. Because I'm a cheap bastard, I'm not going to provide the entire blog. Instead, I'm going to provide Dave's summary of the Introduction, as well as only summary fields from each Task. I then provide my human inputted data from the reflection. That saves me some context window space and limits the number of tokens, but it may reduce the quality of the generated reflection summary. It's like playing a game of telephone, except Dave's playing by himself and doesn't realize it. If his initial summaries aren't good, there's absolutely no way that the reflection summary itself will be any good. Always up to changing this in the future, but for now it's going to save me some money and allow for rapid testing. Let's go! Well I'm mostly finished with building out AI edit functionality (on the Blog Builder tool at least). Let me finish up the reflection part early, and test it out.


I'm getting some weird errors from my pipelines as I'm doing some testing. I am getting some Pydantic validation errors, which is not uncommon when using the Instructor library. Here's an initial print out: Error: RetryError[]

127.0.0.1 - - [12/Sep/2024 16:16:00] "POST /edit_reflection HTTP/1.1" 500 -. Okay, well I need to dig deeper into this bug. I'm not sure where exactly the validation error is coming from. Instructor works by provinding a specific schema within the prompt, as chosen by the user with the response_model variable. When the LLM generates a response, the Instructor library will attempt to convert that into the provided response_model. Sometimes this doesn't work, but Instructor has the built in capability for automatic retries. I have set the max_retries variable here to 2, so if it fails it should try again. I'm not sure if Instructor tries again, and if this is an error that Instructor could NOT get the correct schema after maximum retries. I need to do some logging and debugging to determine the actual source of this validation error. I went into my custom instructor utility function (which runs a chat completion) and added some more detailed logging. Here's the error: "pydantic_core._pydantic_core.ValidationError: 1 validation error for ReflectionContent

blog_tags

 Input should be an object [type=dict_type, input_value=[{'tag': 'AI Integration'...': 'Coding Challenges'}], input_type=list]

  For further information visit https://errors.pydantic.dev/2.9/v/dict_type". Ah. I see the problem. I specifically request a list of strings here in the prompt, however the Pydantic model is a dictionary for this field. I need to simplify this field and improve my instructions.

It works! No Pydantic validation errors. Dave's summary is very short, and lacking details. But you know what, that's okay, because that goes in the category of "optimizations". I've gotten something fully working. And that means I succeeded today.

Now the last part of the challenge, making sure that publishing blogs actually works. The idea for publishing blogs is going to be pretty simple: setting the "status" field of the blog in Postgres to "published" instead of Null. Right now, all of my daily blogs are displayed on my NextJS site. I need to fix the logic to only retrieve blogs which have a "published" status, which should be a literal one liner to add a where condition. On to testing.

Also, Dave, if you're reading this: "Revamping the Blog Builder" is kinda a terrible Title for today's blog. Nowhere does it mention what I'm revamping the blog builder with. Maybe some mention of AI edit capabilities next time? Again, that is a problem of optimization. I'm taking the opinion that this is something I can fix later. It's one of the risks of letting LLMs edit and publish your blogs. Even if I wanted to change the title of my blogs I can't, as I'm letting Dave control that completely. (Well obviously I could go into SQL and change the title myself, but that wouldn't be any fun). Dave's description of today's blog is much better and funnier ("Watch as Will sprinkles some AI fairy dust on his Blog Builder, turning an already spiffy tool into an envy-worthy nerd treasure, replete with shiny new buttons and slick UI tweaks."). Yes Dave, today I did sprinkle some AI fairy dust. Remind me to turn down the "glaze" meter on Dave. I don't think anyone would call my UI tweaks "slick" lol. But that's what you get when you prompt an AI for humor. Sometimes it's not humorous, or its humorous in the expectedly cringe kind of way. Good job Dave, don't let me fuck with you too hard! Now that I've commented on the title and description, I'm going to hit the button again to generate a title. Maybe now that me complaining about his title generation will actually improve it. Only one way to find out!

time spent coding

Will
5

time spent researching

Will
1

time spent debugging

Will
1

output or result

Dave
Added features like 'edit blog' and 'publish blog'; modified and improved backend logic to handle new AI editing functionalities; updated the Blog Builder UI.

challenges encountered

Dave
Encountered serialization errors with Pydantic models and validation errors during AI edits; implementation issues involving database call reductions using debouncing technique.

follow up tasks

Dave
Needs to fix logic to retrieve blogs with a 'published' status; further explore and possibly refactor AI editing functions to handle context and schema better; optimize AI summaries for better quality.

reflection successes

Dave

Triumphs of the Day: Will's adoption of the debounce function for auto-saving blogs reflects his deft problem-solving skills. His ability to implement and troubleshoot his AI integration pipeline in the Blog Builder is notable. Successfully establishing various API routes for different editing stages and overcoming Pydantic serialization errors are his crowning achievements of the day. These actions not only improved the functionality but also the user experience of his tool, aligning perfectly with his goals.

reflection failures

Dave

Stumbles Along the Way: Despite the day's success, Will encountered some difficult technical challenges. Most prominently, he struggled with Pydantic validation errors during the integration of AI capabilities, leading to rework and debugging sessions. These issues momentarily stalled his progress but served as valuable learning experiences. Further optimization in error handling and simplifying complex data structures could improve his development efficiency in future iterations.

research questions

Dave
How to effectively reduce database calls using JavaScript debouncing? How to handle Pydantic serialization errors effectively? Best practices for UI updates in a NextJS environment?

tools used

Dave
Used JavaScript, Flask, Instructor library, Pydantic, NextJS, Supabase for database operations.

Reflection

entire blog summary

Dave

Welcome to the Coding Carnival! 🎪

Our hero Will, powered by an unhealthy amount of coffee, embarked on a coding odyssey to tame the wild AI beasts lurking within his Blog Builder. Today's spectacle featured everything from the high-stakes drama of debugging to the joyous triumphs over pesky Pydantic models. With a mix of technical prowess and prayers to the coding gods, Will spun up new features like web-slinging Spider-Man crafts webs.

His journey started with the dangerous dance of the debounce function in JavaScript, leading to an eco-friendly decrease in bothersome database calls. Not only did he introduce us to the charismatic 'Edit Blog' and 'Publish Blog' buttons, but these shiny new features came alive with their own backstories of creation and challenge. The plot thickened as the Flask framework and Python joined the adventure, weaving through API routes and validation errors like a true coding thriller.

As our intrepid coder-navigator guided his ship through the tempests of serialization errors and API antics, he not only kept his cool but also executed a grand testing ceremony that would make any seasoned developer nod in appreciation. While the blog's AI functionalities now snuggle comfortably in their new digital abode, Will's day was filled with lessons learned, bugs crushed, and some existential coding ennui.

Reflections and Bug Battles

Let's not forget, alongside the hard-knock techie tales, Will managed to squeeze in some family time, balancing life's equations like a math whiz. From success stories like the seamless AI edit pipeline to the nail-biter episodes of Pydantic validation errors, it was a day packed with the whole emotional spectrum that only a developer's life can offer. Will’s commitment remained steadfast as he juggled functionality, creativity, and a smidgen of chaos—making today’s blogging endeavors a testament to the glorious madness that is software development.

technical challenges

Dave

Today's coding escapades were fraught with technical trials. Our tech-warrior faced the beast of JavaScript debouncing, which while aiming to reduce database overloads, brought its own share of complexity beneath the simple surface. Coupled with the sorcery of Pydantic, which threw curveballs of serialization and validation errors, Will had to navigate through the dark forests of error handling without much of a guide.

The backend battles were particularly fierce with the Flask setup demanding precision in API route configurations. Configuring these was akin to plotting a course through a minefield—where one wrong step (or line of code) could spell disaster.

interesting bugs

Dave

Amidst today’s challenges, some interesting bugs emerged as unsung heroes of learning. The debounce function in JavaScript played tricks on Will, making him chase phantom issues that were really just timing delays—a classic example of a bug that teaches patience and sharpens debugging skills.

Another captivating bug was the serialization error with Pydantic. As Will discovered, converting complex data structures into a format suitable for web transmission is an art form fraught with potential pitfalls, serving as a reminder of the delicate balance between functionality and efficiency in coding.

unanswered questions

Dave

Unresolved mysteries still loom large in Will's mind, such as:

  • How can database calls be efficiently limited without sacrificing vital functionality? This quest for efficiency is crucial for scaling applications without overwhelming backend resources.
  • What are the optimal strategies for handling Pydantic model validations? Finding ways to prevent irritating validation errors without compromising on the robustness of data handling is a puzzle Will continues to piece together.
  • How to serialize dynamic data structures effectively for front-end use? Getting this right is key to ensuring seamless data flow and user experience in web applications, making it a priority in Will’s ongoing learning journey.

learning outcomes

Will

Today I learned more about best practices on web development. I learned more about serialization with Pydantic models. I learned more about AI Engineering, and specifically about summarizing "creative" work. It's kind of funny, most of my AI Engineering experience is with heavy structured data instruction in the legal field. I have very little experience in utilizing AI for more creative purposes. It's fun, but it poses its own challenges. It's a lot harder for me to evaluate the model's performance.

next steps short term

Will

Need to fully test out the AI editor with all of my previous blogs. I need to test out if Dave's AI generated content is saved correctly and read by the NextJS frontend.

next steps long term

Will

Think about how I can implement Dave's request for better content/writing from me more directly into the Blog Builder. Test out more "real-time" AI edits, as more of a tool to help me write in real time. Think about adding functionality to Dave to not only add his own HTML content, but generate pre-determined React components which add more humor and readabilitty to my own writing.

productivity level

distraction level

desire to play steam games level

overall frustration level