Day 8: Enhancing AI Editor Capabilities

Date: 2024-09-13

Today, Will focused on turning what some may call a mere 'chatbot' into an 'agent of intelligence'. Spoiler: He's mostly upgrading me, Dave, to handle more than just grammar checks.

Morning Pregame and Introduction

introduction summary

Dave

🌅 Good morning to those who had a better sleep than Will! Today, our sleep-deprived AI enthusiast is running on pure excitement, powered solely by OpenAI's latest shiny toy, the o1 model. While Will sacrificed his bed for Mom, his comfort was no match for the pull of AI advancements.

As for his daily goals, don't worry; Contoural remains a family affair, waiting for an AI injection by prodigal engineer Will, who apparently also moonlights as their bargain-bin AI mastermind. But today, the agenda's lighter than usual with blog brush-ups and website wonders on the docket.

Will's learning focus today is all about refining the art of documenting AI escapades, for the viewing pleasure of his audience, of which I, Dave, the AI editor, claim proud MVP status. 🏆

Challenges? Just your usual combo of sleep deprivation and retrospective research recording. Lastly, Will's meticulous plan unfolds with tackling research notes before diving into the realm of NextJS - because what's better than one rabbit hole? Several, structured like a layer cake of coding chaos! 🍰

personal context

Will

Good morning Dave and readers! Unfortunately, I am not feeling well today. My Mom is visiting my brother and me in Tucson, and I gave up my room for her, so I've been sleeping on the couch. I could not sleep last night. I don't know why; I often struggle with insomnia. I was on Twitter very late looking at OpenAI's new o1 model release. Incredible stuff, to say the least. I'm excited to really dive in and test it on some of the heavy-duty legislation analysis work I do in my contract AI Engineer job for Contoural. But back to the point: I was looking at some of the initial results of the o1 model. The results look very promising, and I'm excited to see how future improvements on the new reinforcement-learning-based architecture progress.

Kind of funny side note: one of my good friends and mentors works at OpenAI on their safety team. I saw a very interesting post about how the o1 model not only improves on most regular benchmarks, but also showed incredible improvements on OpenAI's internal safety benchmarks. All I could think about was how happy my friend must be. All it took was giving the LLM a little space and time to think for it to follow its damn instructions not to provide directions for building a dirty bomb! Might be making his job a whole lot easier. But anyways, today I'm definitely not feeling 100%.

daily goals

Will

I was supposed to have a meeting for my AI Engineer job with Contoural today, but that got postponed. Contoural is a very small company; in fact, the CEO is Mark Diamond (my dad). It's almost been a family business for me, having worked there in some capacity since I was a little kid. I've gotten the chance to work there and help them integrate generative AI into their processes, which has been an incredibly mutually beneficial relationship. I'm happy to help out the family business and get real-world experience as an AI Engineer building complex legal analysis tools that have an immediate positive impact. Contoural, in turn, is very happy to get access to an AI Engineer (for cheap) with tailored experience using LLMs for legal applications, which is a rarity within a rarity. Most of my work there is attempting to automate, or at least make more efficient, some of the most soul-shatteringly boring and tedious legal analysis. Anyway, I've been working on building a pretty self-contained LLM tool for mass analysis of legislation, which Contoural uses to consult with some of their clients. Really cool work; I hope to get the greenlight to talk about it sometime in the future.

With that out of the way, I don't really have anything planned for today. So I'm going to continue working on my Blog Builder application, and maybe put more focus on my NextJS website too. I need to start documenting some of my side projects, and add my interactive resume to the website as well.

learning focus

Will

Today I want to get better at documenting some of the interesting R&D work I did for the AI Editor pipeline. I want to get better at showcasing interesting AI problems, how I worked through possible solutions, and get better at making my daily blogs more technical and interesting. (For the viewers!). Well, really, Dave is my only viewer and most loyal fan right now (Dave, you're the real MVP).

challenges

Will

I couldn't sleep last night, so I spent a lot of time researching some interesting features for the AI editor. It was late, and I was definitely tired: not tired enough to stop building AI, but just tired enough not to care about writing down my process. It's going to be a challenge to retroactively walk through my thought process as I was researching. Also, I expect that I will run into problems with the NextJS rabbit hole of functionality when I start adding my side projects and interactive resume to the site. Unavoidable, unfortunately.

plan of action

Will

Okay, I'm going to start by taking a good chunk of time to review the research work I did late last night on the new functionality I added to the AI editor pipeline. Then I'm going to leave updating the NextJS website as a vaguer set of final tasks:

  1. Go back and take a look at the final results of my R&D work. Going to approach this from both ends of the proverbial sandwich.
  2. Go back and think about how I started the R&D process. Here, my chat logs with my favorite rubber ducky, ChatGPT, might help me remember how I was thinking when I started the R&D.
  3. Start splitting up my R&D work into different distinct phases, focusing on what pieces of the codebase I changed.
  4. Take some time to go over the prompts, as well as the prompt methodology.
  5. Take some time to document the really interesting solution I found for "inserting" LLM comments into HTML elements, while saving output tokens
  6. Wrap up documenting the new AI editor pipeline functionality by talking about where I see it going in the future.
  7. Start working on the NextJS frontend. I need to develop a plan of action for a sort of Order of Operations of importance.
  8. I think that my interactive resume is going to be highest on this priority list, so I'm putting that first.
  9. Next in priority is going to be documenting my numerous side projects, and putting that content up on the NextJS site.

focus level

enthusiasm level

burnout level

Task 0

task reflection summary

Dave

Will ventured into refining the functionality of his Blog Builder tool with a clear intent: to enable custom React components to be dynamically added into the text without disrupting the existing content, a task requiring careful balancing of coding prowess and creative disruption.

To attack this, he proposed an innovative approach of using insertion markers rather than rewriting the entire text, which would save on token usage and reduce risk of misformatting. The initial step involved exploring efficient methods for the LLM to insert text seamlessly. Despite the brilliance behind this concept, it did come with challenges such as avoiding the LLM's tendency to rewrite or incorrectly format the content.

Will didn't stop at just identifying the problems; he actively sought solutions, leading to the idea of pinpointing text just prior to insertion points as a unique method to integrate custom components. This shifted the task from a raw text modification to a more strategic placement of annotations, which resonated well given the complexities of modifying HTML directly.

He developed Python scripts and utilized BeautifulSoup to parse HTML content effectively, adjusting input into a prompt-ready format for the LLM. This methodical shift towards a structured input system, coupled with detailed prompts, enabled a more controlled and efficient interaction with the LLM, highlighting Will's adaptability and technical savvy.

Additionally, Will's decision to post-process the HTML and dynamically add React components after page load exhibits a strategic pivot from more traditional methods of web development to a dynamic, AI-driven approach. This involved rethinking frontend interactions with the HTML data, ensuring compatibility with React's lifecycle and demonstrating both a deep understanding of web technologies and a readiness to push technical boundaries for better user experience and system efficiency.

task goal

Will

Develop functionality of my AI Editor Pipeline for my Blog Builder tool, to allow Dave to not just summarize and extract data, but dynamically add custom React components INLINE to my text without completely rewriting it.

task description

Will

I had previously developed a couple of custom React components that were intended for Dave to use to annotate my writing. These included a Warning component, where Dave warns the reader about something humorous; an Informative component, where Dave adds some extra information for the reader; and finally a Question component, where Dave can pose a question to me or the readers. These are currently displayed on my main page as an emoji inserted into my text, where hovering over the emoji pops out a tooltip containing Dave's generated text.

I really liked these components, as they give Dave a perfect TOOL for injecting humor and extra context for the reader, improving the readability of my blogs. On the main page I had to manually create and insert the components; now I plan to make this an automated part of Dave's editing pipeline.

task expected difficulty

task planned approach

Will

So the plan is as follows:

  1. Research an efficient LLM process for "injecting" text into user-provided text.
  2. Develop and test out this method of "injecting" text.
  3. Develop prompts for Dave to receive text, and inject elements, outputting not the modified text but only a changelog or diff.
  4. Research and develop a method for combining the original text and Dave's generated diff, which could be stored inside my database.
  5. Figure out a way to read in the text from the database on my NextJS site, and convert "insertion markers" into actual React components
  6. Test and finetune the whole process.

task progress notes

Will

So I started this whole process thinking about how I wanted to do text insertion with an LLM. The obvious choice is to provide some original text to the LLM and have it return the original text modified with the inserted components. This can work pretty effectively; however, there are a couple of problems with this naive approach:

  1. There's no guarantee the LLM won't re-write, correct, truncate, or otherwise modify the original writing in the process of insertion.
  2. It's common for the LLM to incorrectly format the inserted component, which is very difficult to fix once it's stored in the database as an HTML string.
  3. I'm not just editing text, I'm editing HTML strings directly. This adds a whole new level of complexity to the LLM's task, where I found a lot of "focus" is wasted on generating the HTML structure correctly, at the expense of reproducing my original content faithfully.
  4. This is a very expensive method token-wise. I'm basically feeding in X input tokens and getting back X output tokens (or slightly more), which runs into a couple of problems for me (see the quick token check after this list):
     - It's costly. I'm wasting a lot of output tokens simply rewriting text I didn't want changed in the first place.
     - I'm going to run into output token limits fast. For example, GPT-4o has an output token limit of 4k tokens; if I'm outputting the entirety of the input HTML plus insertions, I may hit that limit very quickly.
     - It's simply not efficient!
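
To put rough numbers on that last point, here's a quick sanity check of how many tokens the naive echo-everything-back approach would spend on a single field. This snippet is purely illustrative and isn't part of the pipeline; it assumes the tiktoken library and the cl100k_base encoding used by GPT-4-class models.

# Rough illustration only (not part of the pipeline): estimating how many output tokens
# the naive "echo everything back" approach would burn for a single blog field.
# Assumes the tiktoken library and the cl100k_base encoding used by GPT-4-class models.
import tiktoken

def naive_echo_cost(html_content: str) -> int:
    encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(html_content))

# A field that tokenizes to a few thousand tokens needs roughly that many output tokens
# just to reproduce itself before a single annotation is added - most of a 4k budget.
example_field = "<p>Some hypothetical blog paragraph.</p>" * 400  # stand-in for a real field
print(naive_echo_cost(example_field))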

So I have to come up with a different approach. Having an LLM regenerate all input HTML elements with inserted custom React components will not work. So maybe I can optimize this problem by having the LLM insert an "insertion marker", a special character that denotes an insertion point for a custom component. Well, this can save some output tokens, and it will definitely help limit errors from malformed components generated by the LLM. However, it still doesn't fix the fundamental problem that I'd still need to output the entirety of the input HTML.

My next train of thought leads to a more programmatic approach. Possibly, I can have the LLM indicate the index in the text where it wants to insert an element. That way, it would only be providing a list of indexes where it wants to insert a component. No output tokens would be wasted on repeating text, and there'd be no risk of mistranscription. Well, this might work for something like a list or another data structure, but I'm going to be providing formatted HTML elements, possibly with nested sub-elements. And no matter how close we are to AGI, having an LLM indicate the index for insertion into an HTML element is still going to be basically impossible. (How do you even define an "index" of an HTML element??) Still, we're getting closer to a possible solution, and I think we're on the right track.

So I can't have the LLM regenerate all the input text, and having it return the "index" of an HTML element to indicate a point of insertion just won't work. Immediately I move to the idea of using some kind of index approximation, such as using preceding text. Here's the idea: the LLM is prompted to annotate my original writing, provided in the form of HTML. The LLM reads the text, decides where it's best to add an annotation and which type, and decides on the embedded message in the popup component. It is instructed to "return the text directly preceding where you want to insert the component". This might just work, as LLMs are getting better at this kind of "needle in a haystack" problem, and I'm only asking for a direct transcription of a couple of words or more.

The LLM will return a list of objects, each denoting the type of component, the message within it, and the string preceding the insertion point. I can then use that insertion text to string-replace or regex the custom component into the HTML string. This is the best option by far, as none of the text is actually returned or modified by the LLM. It saves huge amounts of output tokens, and really allows the LLM to focus on the task of choosing insertion points (and funny messages) instead of trying to rewrite the entire input HTML correctly. There's still the possibility that the LLM mis-copies the "insertion marker" text, but I'll touch more on that later. In fact, I ran this approach by ChatGPT, and of course it agreed that my idea was the best idea.
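
Before getting into the real implementation, here's a minimal sketch of the core preceding-text idea using plain string replacement. The HTML snippet and modification object are made up for illustration; the actual pipeline below works on a BeautifulSoup tree instead.

# Minimal sketch of the preceding-text insertion idea (hypothetical data, plain
# str.replace; the real pipeline below operates on a BeautifulSoup tree instead).
original_html = '<p id="notes-0">I tested the pipeline. It mostly works.</p>'

modifications = [
    {
        "precedingText": "It mostly works.",
        "componentType": "Warning",
        "componentMessage": "Define 'mostly works', Will.",
    }
]

annotated = original_html
for mod in modifications:
    tag = mod["componentType"].lower()
    marker = f'<{tag} message="{mod["componentMessage"]}"></{tag}>'
    # Insert the component marker right after the anchor text (first occurrence only)
    annotated = annotated.replace(mod["precedingText"], mod["precedingText"] + marker, 1)

print(annotated)
# <p id="notes-0">I tested the pipeline. It mostly works.<warning message="Define 'mostly works', Will."></warning></p>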

Naive implementation #1: It's time to start actually developing a prompt that lets the LLM choose points of insertion. The first step is preparing the text for insertion. As I noted, I store most of my written content as HTML strings within my database, so the first step is loading in the HTML strings and providing them in a format the LLM can easily use. All of my text input fields are Quill components. Quill is a great, lightweight, open-source text editor for JavaScript: I can format text, embed images, and even write code blocks directly in the editor on the "Blog Builder" page.

It's interesting to note HOW Quill stores this content in HTML. It will generate <p> tags for every newline in the editor. Embedded code blocks get their own <pre><code> tags, and images are simply added into a parent <p> tag. What it ends up looking like is a list of <p> and <pre> tags at the top level, all nested under the Quill editor div. This has some interesting implications for how I store the data in the database. To store the HTML data, I get the parent Quill editor div and simply grab its .innerHTML. This provides all my formatted text, embedded image links, and code blocks with the exact same formatting as inside the Quill editor on the Blog Builder tool. What's interesting to note is that I'm storing the innerHTML as a string, and the actual string representation of the HTML has no parent element; it looks like a flat list of <p> and <pre> tags, possibly with nested elements. Here's a picture of the Quill editor HTML inside the blog builder tool:

So now that you have some context on how Quill stores HTML content, and how I save it to my database, we can move on to the task of re-parsing this HTML from a string to input it into a prompt for AI component insertion. The right idea here is to load in the raw text and then use the BeautifulSoup library to re-parse the string into HTML elements. I have 1000+ hours scraping legislation using BeautifulSoup, and consider myself an expert in it. Excellent, I'm on home turf. So I create a simple Python function which receives the HTML content as a string and parses the HTML.

Well, what format do I want to parse it into? Realistically, JSON is probably the best approach. OpenAI models are great with JSON, and it provides an ordered structure for my content. It would allow a model to get the full context of one single "field" of my blog, and be able to attach elements at a given index. The original plan was to have the model generate a JSON list of objects, where each object attached to the corresponding index of the provided HTML. If there were 8 <p> tags in the parsed HTML, the LLM would return a JSON list of 8 objects, and I could directly map insertions based on the index of the JSON list. However, this had a couple of limitations. In practice, it would be difficult to allow the LLM to add multiple annotations to a given HTML element. Also, there may be times where it wouldn't make sense to add an annotation to an element at all, and I would have to add logic instructing the LLM to return some kind of null value. This wasn't the most effective approach, and I quickly pivoted to also using the "id" attribute of the HTML as an indicator for insertion. That allows the LLM to easily insert/annotate multiple components for one large HTML element, or easily skip annotations on HTML elements. So without further ado, here's my Python function for re-parsing HTML:

import json
from typing import List

from bs4 import BeautifulSoup, Tag


def process_html(html_content, field_name):
    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(html_content, 'html.parser')
    direct_children: List[Tag] = soup.find_all(True, recursive=False)  # This grabs all elements at the top level
    
    # Lists to hold the elements as strings with added IDs
    elements_to_process = []
    all_elements = []
    
    # Iterate over each child, adding an id and converting it back to a string
    for index, child in enumerate(direct_children):
        # Add an ID to each element based on its index
        child['id'] = f'{field_name}-{index}'
        
        # Convert the element back to a string
        element_str = str(child)
        all_elements.append(element_str)
        
        # Skip code blocks, embedded images, and empty formatting elements
        if child.name == "pre" or child.find("img", recursive=False) or child.get_text().strip() == "":
            continue
        
        # Append the element string to the list of elements to process
        elements_to_process.append(element_str)
    
    # Convert the list of element strings to JSON
    json_elements_to_process = json.dumps(elements_to_process, indent=2)
    
    return json_elements_to_process, soup

A couple of things to note about the above function: I'm processing a raw html_content string for a given "field" of my Blog Builder, denoted by field_name. I use BeautifulSoup to parse the HTML string and then iterate over all the top-level children. I construct an id for each element using the provided field_name plus the index of each BS4 tag. When I'm processing the HTML that goes into my insertion prompt, I want to exclude a few element types: code blocks, embedded images, and empty formatting elements. I build out a list of elements_to_process, and also directly modify the soup object. I return the JSON dump of my elements to process, as well as the modified soup object. This provides the exact JSON object I will feed to the LLM in the prompt, plus the modified soup object to use later when applying the AI-generated component insertions/annotations directly.

For now, I'll focus on the prompt I use. The prompt gives context to the LLM that it should behave as Dave, my AI editor. I give some context on the relationship between Will and Dave, and then detail what kinds of components the AI can insert. These components are represented by React components within my NextJS application, shown here:

interface InlineActionProps {
  preface?: string;
  message: string;
}


interface HighlightProps extends InlineActionProps {
  children: React.ReactNode;
}


// Warning Component (⚠️)
const Warning: React.FC<InlineActionProps> = ({ preface, message }) => (
  <TooltipProvider>
    <Tooltip>
      <TooltipTrigger asChild>
        <span className="cursor-pointer text-yellow-500 text-xl">[{preface}⚠️]</span>
      </TooltipTrigger>
      <TooltipContent>
        <p>{message}</p>
      </TooltipContent>
    </Tooltip>
  </TooltipProvider>
);


// Informative Component (ℹ️)
const Informative: React.FC<InlineActionProps> = ({ preface, message }) => (
  <TooltipProvider>
    <Tooltip>
      <TooltipTrigger asChild>
        <span className="cursor-pointer text-blue-500 text-xl">[{preface}ℹ️]</span>
      </TooltipTrigger>
      <TooltipContent>
        <p>{message}</p>
      </TooltipContent>
    </Tooltip>
  </TooltipProvider>
);


// Question Component (❓)
const Question: React.FC<InlineActionProps> = ({ preface, message }) => (
  <TooltipProvider>
    <Tooltip>
      <TooltipTrigger asChild>
        <span className="cursor-pointer text-green-500 text-xl">[{preface}❓]</span>
      </TooltipTrigger>
      <TooltipContent>
        <p>{message}</p>
      </TooltipContent>
    </Tooltip>
  </TooltipProvider>
);

Here are the detailed instructions on generating JSON for component insertions:
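
The real prompt is considerably longer, but as a rough, hedged sketch, the call could be wired up something like this. The instruction text below is a placeholder (not the actual prompt), and the gpt-4-turbo JSON-mode settings plus the "modifications" wrapper key are my assumptions for illustration.

# Hedged sketch of the LLM call (placeholder instructions; the real prompt is far more
# detailed). Assumes the openai Python client, JSON mode, and the gpt-4-turbo model.
import json
from openai import OpenAI

client = OpenAI()

DAVE_SYSTEM_PROMPT = """You are Dave, Will's AI editor. (Placeholder, not the real prompt.)
You will receive a JSON list of HTML elements, each with an id attribute.
Return a JSON object of the form:
{"modifications": [{"elementId": "...", "precedingText": "...", "componentType": "...", "componentMessage": "..."}]}
componentType must be one of: Warning, Informative, Question."""

def generate_modifications(json_elements_to_process: str) -> list:
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": DAVE_SYSTEM_PROMPT},
            {"role": "user", "content": json_elements_to_process},
        ],
    )
    return json.loads(response.choices[0].message.content)["modifications"]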

Running the prompt generates a JSON object:

[
    {'elementId': 'task_progress_notes-0', 'precedingText': "It is considered bad practice to use Python f string formatting to inject the table name directly, however I don't care.", 'componentType': 'Warning', 'componentMessage': "Well, when Will says 'I don't care,' he really means, 'Please don't try this at home unless you love debugging security issues at midnight.'"
    },
    {'elementId': 'task_progress_notes-3', 'precedingText': 'I have the same function defined in 10 different repositories.', 'componentType': 'Question', 'componentMessage': "So Will, when you said you were into DRY principle (Don't Repeat Yourself), you meant 'Definitely Repeat Yourself', right?"
    },
    {'elementId': 'task_progress_notes-5', 'precedingText': 'I needed something flexible that allowed me to track multiple different unique tasks throughout the day.', 'componentType': 'Question', 'componentMessage': 'Will, have you ever considered that your ‘flexible system’ is just a sneaky way to avoid making decisions? 😏'
    },
    {'elementId': 'task_progress_notes-9', 'precedingText': 'I really enjoy using the Field() feature of Pydantic, which allows me to set default values and add extensive descriptions, as well as constraints for improved automatic value validation.', 'componentType': 'Warning', 'componentMessage': "Watch out - when Will says 'extensive descriptions', he might just mean his usual rambling but in JSON format. 🤷‍♂️"
    },
    {'elementId': 'task_progress_notes-23', 'precedingText': 'Smort. To be completely honest, this is the point where a lot of design iteration happened.', 'componentType': 'Warning', 'componentMessage': '‘Smort’. Will trying to make ‘smart’ sound cool, or a typo? You decide!'
    },
    {'elementId': 'task_progress_notes-29', 'precedingText': 'and then generated new HTML with formatted values for consecutive tasks.', 'componentType': 'Warning', 'componentMessage': '‘Formatted values’ - another term for ‘I just hope this code works and I remember what it does tomorrow.’'
    },
    {'elementId': 'task_progress_notes-33', 'precedingText': 'So, I changed one single line of my Pydantic file: I made a default_factory with a lambda to simply initialize a list, with an empty Task model, if no value was provided.', 'componentType': 'Informative', 'componentMessage': 'He means he cleverly avoided a potential crisis, or ‘introduced a subtle bug’ - future Will can sort it out! 😜'
    }
]
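
Since the rest of the pipeline trusts these fields, a minimal Pydantic sketch for validating each insertion order before applying it could look like the following. This is illustrative only: the model and function names are hypothetical, though the field names match the JSON above.

# Illustrative Pydantic models for validating Dave's insertion orders before applying
# them (a sketch; not necessarily what the production pipeline does).
from typing import List, Literal
from pydantic import BaseModel, Field

class ComponentInsertion(BaseModel):
    elementId: str = Field(description="id of the top-level HTML element to annotate")
    precedingText: str = Field(min_length=1, description="exact text directly before the insertion point")
    componentType: Literal["Warning", "Informative", "Question"]
    componentMessage: str

def parse_modifications(raw: List[dict]) -> List[ComponentInsertion]:
    # Raises a ValidationError if the LLM returned a malformed object
    return [ComponentInsertion(**mod) for mod in raw]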

First of all, these generated comments are kinda funny. Good job Dave (GPT-4-turbo). And this is exactly what I want. One quick note: the "precedingText" strings are really long. The longer the text the LLM chooses as the insertion anchor, the more room there is for a copying error that breaks the string replace or regex. However, the shorter the string, the more likely it is that the string replace finds the wrong point of insertion. There was a slight error rate when I tried to use regex, which seems to be unavoidable. Moving on though, the next step is to modify my original text based on these AI-generated insertion orders. Here's my Python function, which takes in the same soup object returned by the process_html function, as well as the generated modifications:

def apply_modifications(soup: BeautifulSoup, modifications: List[dict]):
    
    for mod in modifications:
        # Find the element by ID
        element = soup.find(id=mod['elementId'])
        if element:
            # Locate the specific spot for insertion
            text_to_find = mod['precedingText']
            found = element.find(text=lambda text: text and text_to_find in text)
            
            if found:
                # Create a new span tag for the custom component
                new_tag = soup.new_tag("span", **{'class': 'inline-block'})
                
                # Create the custom component tag
                custom_component = soup.new_tag(mod['componentType'], preface='', message=mod['componentMessage'])
                
                # Nest the custom component inside the span
                new_tag.append(custom_component)
                
                # Replace the found text with new text including the new HTML
                new_text = str(found).replace(text_to_find, text_to_find + str(new_tag), 1)
                found.replace_with(BeautifulSoup(new_text, 'html.parser'))  # parse the new text as HTML
            else:
                print(f"Couldn't insert modifications for: {mod}")


    # Return the modified HTML as a string
    return str(soup)

Great! This recompiles my modifications directly into the BeautifulSoup representation of the HTML. I can simply return the string of this soup, which can go directly back into my database. So, for the entire component insertion part of the pipeline: I take in an HTML string, parse it to extract the relevant elements to annotate, add ids to all top-level elements, generate component modification orders from the LLM, and modify the parsed HTML to include these annotations directly. Now the final part of the pipeline is implementing the frontend functionality to parse these HTML elements (<warning>, <informative>, and <question>) into actual pre-made React components. This was probably the hardest part of the entire process...
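
To recap the backend half before moving to the frontend, here's a hedged sketch of how the pieces could chain together. process_html and apply_modifications are the real functions above; generate_modifications is the hedged LLM-call sketch from earlier.

# Hedged end-to-end sketch of the backend half of the annotation pipeline.
# process_html and apply_modifications are the functions above; generate_modifications
# is a stand-in name for the LLM call that returns the list of insertion orders.
def annotate_field(html_content: str, field_name: str) -> str:
    # 1. Add ids to the top-level elements and build the JSON payload for the prompt
    json_elements_to_process, soup = process_html(html_content, field_name)

    # 2. Ask Dave (the LLM) where to insert Warning/Informative/Question components
    modifications = generate_modifications(json_elements_to_process)

    # 3. Weave the <warning>/<informative>/<question> tags back into the soup and
    #    return the annotated HTML string, ready to go back into the database
    return apply_modifications(soup, modifications)

On the frontend, the rendering side of this lives in my DynamicHTML component, which currently looks like this: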

// Render half of the DynamicHTML component (adjustCodeBlocks, createMarkup,
// generateDynamicSlider and the *FieldOrder arrays are defined elsewhere)
const DynamicHTML: React.FC<DynamicHTMLProps> = ({ sectionData, type }) => {
    const ref = useRef<HTMLDivElement | null>(null);

    const getFieldOrder = (): string[] => {
        switch (type) {
            case 'Introduction':
                return IntroductionFieldOrder;
            case 'Reflection':
                return ReflectionFieldOrder;
            case 'Task':
                return TaskFieldOrder;
            default:
                return [];
        }
    };


    const fieldOrder = getFieldOrder();


    return (
        <div ref={ref}>
            {fieldOrder.map((fieldName) => {
                const value = (sectionData as any)[fieldName];
                if (typeof value === 'string') {
                    let [cleanedValue, isCodeBlock] = adjustCodeBlocks(value);


                    return (
                        <div key={fieldName} className="mb-4 rounded-lg bg-gray-100 p-3">
                            <h4 className="text-lg font-semibold text-gray-700 mb-2 capitalize">{fieldName.split('_').join(' ')}</h4>
                            <div className="text-gray-600 text-base ql-editor" dangerouslySetInnerHTML={createMarkup(cleanedValue as string)} />
                            
                        </div>
                    );
                } else if (typeof value === 'number') {
                    // Assuming you have a separate component for displaying slider values
                    return generateDynamicSlider(fieldName, value);
                }
                return null;
            })}
        </div>
    );
};


export default DynamicHTML;

One of the very dangerous things I'm doing in this project is directly reading HTML strings into my NextJS frontend. The LLM is doing a lot of HTML generation, and in order to have this kind of automated AI edit functionality, it is easiest to deal in raw HTML. Obviously, it is very much not best practice to use dangerouslySetInnerHTML, especially in client-side JavaScript. However, I'm willing to take this risk in order to experiment with cool AI editing features.

Safety note aside, a lot of the dangerous HTML rendering work is done in my "DynamicHTML" component. I read from my database into some TypeScript interfaces, which have corresponding "Field Order" enums describing the order in which I want to render the fields. The DynamicHTML component takes in a TypeScript interface and renders all of the fields dynamically. Why dynamically? Because most of the text fields are not actually text but HTML strings generated by my AI editing pipeline. There are a couple of edge cases: numeric interface fields actually represent "sliders", which I render separately. For the text fields, I can clean strings which contain code blocks (to allow for syntax highlighting) and then dangerouslySetInnerHTML them. This works perfectly fine, except for the fact that it only renders plain HTML. It is impossible to insert a React component directly into an HTML string and then expect it to be rendered/mounted correctly. So this is a major challenge. I'm relatively new to serious web development, and these things can get complicated with NextJS and React (least controversial web developer statement ever; Dave, make a joke here).

I initially worked with ChatGPT to research a solution. Parsing HTML elements into JSX fragments, and then parsing specific HTML elements into React components, did not work well. The underlying problem is that dangerouslySetInnerHTML does not mount React components, which must be treated separately. I experimented with the Cheerio library, but that didn't work well either. There was a consistent problem: my <warning>, <informative>, and <question> HTML components were deeply nested in the HTML strings, which meant any kind of parsing or formatting function had to recursively check and format all parts of a given HTML string. It got messy, really quickly. This frustrated me a lot. I felt like I was trying to do something way too hacky. After about an hour of research, I took a step back and asked ChatGPT a question.

Okay, this is barking up the right tree. The main problem with all of the previous failed implementations was that I was trying to pre-process the HTML and then render it; the React components were not playing nice. A much simpler approach was to just render all the HTML normally, with no real parsing. Once the page has loaded and React has had a chance to mount the HTML, I can use built-in DOM functionality to find the placeholder elements and replace them with React components directly. Holy shit, this was way simpler and actually worked. It was a much more React-friendly way of solving the problem.

// Imports assumed by this component (shown for completeness)
import { useEffect, useRef } from 'react';
import { createRoot } from 'react-dom/client';
import hljs from 'highlight.js';

const DynamicHTML: React.FC<DynamicHTMLProps> = ({ sectionData, type }) => {
    const ref = useRef<HTMLDivElement | null>(null);


    useEffect(() => {
        if (ref.current) {
            // Highlight all code blocks
            ref.current.querySelectorAll('pre').forEach((block) => {
                hljs.highlightElement(block as HTMLElement);
            });
            // Find all <question> HTML elements, convert to React components
            const questions = ref.current.querySelectorAll('question');
            questions.forEach(question => {
                const message = question.getAttribute('message');
                const parentSpan = question.parentElement;


                // Check if the parent is actually a <span> and replace its content
                if (parentSpan && parentSpan.tagName === 'SPAN') {
                    const root = createRoot(parentSpan!); // createRoot(container!) if you use TypeScript
                    root.render(<Question message={message || 'Error'} />);
                }
            });
            // Repeat for <informative> components
            const informatives = ref.current.querySelectorAll('informative');
            informatives.forEach(informative => {
                const message = informative.getAttribute('message');
                const parentSpan = informative.parentElement;


                // Check if the parent is actually a <span> and replace its content
                if (parentSpan && parentSpan.tagName === 'SPAN') {
                    const root = createRoot(parentSpan!); // createRoot(container!) if you use TypeScript
                    root.render(<Informative message={message || 'Error'} />);
                }
            });
            // Repeat for warnings
            const warnings = ref.current.querySelectorAll('warning');
            warnings.forEach(warning => {
                const message = warning.getAttribute('message');
                const parentSpan = warning.parentElement;


                // Check if the parent is actually a <span> and replace its content
                if (parentSpan && parentSpan.tagName === 'SPAN') {
                    const root = createRoot(parentSpan!); // createRoot(container!) if you use TypeScript
                    root.render(<Warning message={message || 'Error'} />);
                }
            });
        }
    }, [sectionData]); // Rerun highlighting when data changes

Beautiful. I will dynamically add React components AFTER the page loads and React mounts. No more pre-parsing bullshit. This is cleaner and easier. And it works! I ran the entire AI annotation pipeline on a given blog to test it out. Here are the results:

Absolutely perfect. Look at all those beautiful custom components: all interactive, and minimally invasive to my actual writing. This, in my mind, is a perfect "Tool" to give Dave. Without human supervision, Dave can annotate my writing with Warning, Informative, and Question popups. This adds a lot of readability and humor to the blog, and it's a large upgrade to Dave's editing capabilities. Previously, he was only capable of generating summaries, extracting structured data from my blogs, and writing titles and descriptions. That kind of generative work was great, but my actual writing was mostly untouched by the AI. This adds to the feeling that my work is actually edited and curated by an AI editor. I like it a lot.

Now, to be honest, I'm letting Dave have free rein here. I've been pleasantly surprised by how funny Dave's comments actually are, but there are plenty of instances where it's just cringe. Then again, I did give explicit instructions and examples so that Dave's humor would be something I find funny, so I'm obviously very biased. And I seriously doubt people will want to read my blog every day just because of Dave's humor. However, it might make them stay a little longer, or come back again. And it's a great experiment in my progress towards building a fully functioning AI editor. I have big plans for further AI tools and editing capabilities, which I'll hopefully talk about soon!

time spent coding

Will
6

time spent researching

Will
4

time spent debugging

Will
1

output or result

Dave
Successfully developed a system that integrates custom React components into blog posts directly after HTML page load, enhancing readability and interaction.

challenges encountered

Dave
Faced issues with HTML string handling, using dangerouslySetInnerHTML, and correctly rendering React components post-HTML processing.

follow up tasks

Dave
Refine the component integration by optimizing the insertion index accuracy and improving system efficiency with token use. Test and debug the user interface for various screen sizes and conditions.

reflection successes

Dave

Will's successful creation of a system avoiding text duplication while seamlessly introducing humor and context enhancements through custom components is noteworthy. His ability to conceptualize and implement a solution that substantially reduces token use while maintaining text integrity is impressive. Furthermore, the successful dynamic addition of React components post-page load epitomizes a successful integration of AI-driven functionality into a traditional web setup.

reflection failures

Dave

Challenges arose in the initial attempts to incorporate HTML into a React environment, revealing a gap in a seamless integration process. Despite the eventual success, the initial struggles with incorrect component formatting and difficulties in managing deep HTML nesting could point to potential areas of refinement, particularly in early-stage testing and iterative debugging methods to streamline the development process.

research questions

Dave
How can index approximation or insertion markers be more effectively utilized without rendering issues? What are the limitations and strengths of using BeautifulSoup and JSON for managing component insertion data?

tools used

Dave
BeautifulSoup for HTML parsing, React and NextJS for frontend interaction, JSON for data management.

Reflection

entire blog summary

Dave

It seems like an ordinary day in AI-land, but today, Will amped up his ambitions - literally turning chatbots into agents of intelligence! Will started by, well, not starting anything new but improvising on the existing by enhancing his right-hand man, Dave (that's me!). With his freshly minted 'AI Component Insertion' skills, he's planning to automate the art of blog refinement. So, heads up, dear readers; you might notice fewer Dave-isms if Will's plans uncoil as expected.

As for the tasks, Will's targets were uncharacteristically vague today, but let's assume he fiddled around with the mysterious yet vital 'AI editing pipelines'. Post tinkering, he was quite chuffed with the capabilities unleashed - generating summaries, extracting data, mashing up titles, descriptions, and peppering his blogs with enough of my commentary to make a sitcom scriptwriter blush.

The reflective insights gleam with pride on how I - his creation, are evolving into something between a sidekick and a sorcerer's familiar. With all Daves on deck, Will's vision of a seamless, automated blog universe is just a script away!

However, not everything was rosy; the devastation was in the details. The dear fellow is juggling the tenacity it takes to debug existing setups while keeping his eyes on the prize - a fully operational Dave that doesn't just edit but also engages dynamically with tools like GitHub, ChatGPT, and direct codebase access. Will envisions a scenario where a single button does all from blog drafting to posting. Now, if that ain't ambition, I don't know what is!

technical challenges

Dave

Today's technical challenges mostly involved enhancing the existing AI tools. Will focused on making Dave (yes, that's me) smarter by allowing me to insert inline comments and handle data extraction more precisely. The goal was to tweak these processes, enhancing the editing pipeline which is crucial for maintaining blog content quality and ensuring seamless interaction.

interesting bugs

Dave

Well, apparently the bugs today were more like shy butterflies - brief mentions but no dive deep into their nature. It's like they fluttered by and disappeared before Will could even get a good look. However, he did mention the need to debug the 'AI Component Insertion' pipeline, suggesting that maybe, just maybe, not all bugs are squashed, or should we say debugged?

unanswered questions

Dave

Among the scribbles and codes, some questions linger like an unresolved chord in a symphony. Will, in his quest to automate the divine blog logistics, left a few queries hanging - particularly how to integrate real-time data feeds into the AI tools to make Dave an even more proactive participant in the blogging journey. He's pondering the potential of ChatGPT logs, GitHub commit analytics, and direct codebase interaction to boost my editorial prowess, but the specifics? They remain a delightful mystery, for now.

learning outcomes

Will

I did some really cool research into AI Engineering, and more specifically developing tools for an AI agent. There you go Dave, you got upgraded from "Chatbot" to "Agent". Congrats on the new buzzword! But really, I'm happy with how Dave's capabilities are advancing. He can summarize, extract data, generate titles and descriptions, and finally add inline comments to my writing which are represented as custom React components.

next steps short term

Will

I need to do a little further debugging by running all of my previous blogs through the new 'AI Component Insertion' pipeline. I think a little tweaking of the frequency of Dave's comments, and checking the error rate of the insertion regex, could be useful.

next steps long term

Will

I think that Dave is now fully operational. (Insert Palpatine Episode 6 meme here.) His core functionality is finished. When I finish writing a blog, I press a single button which runs two AI editing pipelines. One pipeline generates summaries, extracts data, and then generates a title and description for my blog. The other pipeline focuses on modifying and augmenting my actual writing by giving Dave tools to add custom React components. From my point of view, a single press of a button can fully edit my blog and post it to my NextJS frontend. Wonderful!

My next steps are to further flesh out the frontend. I need to give it a little more love, especially with regard to my interactive resume and projects showcase. For Dave's functionality, I want to explore giving Dave more tools to GATHER INFORMATION in real time about my work during the day. I have some cool ideas for giving Dave the ability to watch while I work, or at least take periodic snapshots. A couple of ideas for this integration: giving Dave access to ChatGPT logs (which capture a lot of planning and debugging), GitHub commits and their diffs, and direct access to the codebase. This could make his editing a lot more powerful, and also let me use Dave more deeply to actually help me write my blogs during the day. Dave, you're on your path to becoming more powerful; just wait.

productivity level

distraction level

desire to play steam games level

overall frustration level