Our VP is on the AI bandwagon, big time.
I get it.
We have a provider, really a startup, whose service is supposed to mash up content and spit out a training course. It works OK, provided you feed it right. The idea is that it’ll inhale a technical doc (say, an administration guide) and generate a course, either eLearning or instructor-led. It creates a script, which you can then stuff into an AI speech engine (like Speechelo or Amazon Polly) or one of the AI video generators, like Synthesia. Personally, I find AI-generated voice and video soul-sucking. Call me old school.
It can be a time saver. That said, it can only ingest a limited number of formats. So for a test, I took a module of eLearning for which we have no script, stitched the audio to the slides in Camtasia, and produced a video that the service can ingest. Fair enough.
It gobbled up my test video and built a framework. To be fair, it was one module (an introduction), not a whole course. Still, it got the gist of what was said halfway right. It didn’t transcribe so much as interpret what was being said. So I published an audio script. Here’s what it said:
Delivering the Course Using Adobe and WND165.002 Virtual Machine
A mess
Welcome to the course on delivering content using Adobe and WND165.002 Virtual Machine. This is an on-demand training course that consists of lecture modules, presentations, and hands-on lab exercises. To access the course materials, you will need to use the virtual machine named WND165.002. During the course, you will see the slides presented on the left side of the screen, while the right side panel will display an outline view of the slides and additional options that you can access by clicking the tabs. You can also view the slides in thumbnail view and search for specific content. To navigate through the slides, you can click on a slide in the outline view to advance to it in the presentation. The navigation controls at the bottom left allow you to pause, start the audio and video, and manually move forward and backward in the slides. Additionally, there is a button at the bottom right that you can use to collapse or expand the side panel as desired. By using these features, you can easily navigate through the course materials and access the information you need. So, let’s get started and make the most of this training course.
Every slide, it welcomed you to the course.
Here’s what Dragon heard:
Slide 5
This is on-demand training. Lecture modules are displayed with the presentation by a skilled instructor. There are also hands-on lab exercises associated with this course, you need to use the appropriate virtual machine for this course, the name of the virtual machine is WND16 5.002
Slide 6
A transcription
This course is delivered using Adobe slides are presented on the left side panel is displayed on the right side panel is a list of slides and outline view additional actions are available by clicking the tabs, you can view the slides and thumbnail view and search for content and outline view, you can click the slide to advance the slide in the presentation use the navigation controls at the bottom left to pause start the audio and video, and manually move forward and back in the slides at the bottom right is a button you can use to collapse and expand the side panel as desired.
Reality is somewhere in between. Dragon missed some words.
Reminds me of the late Wilson Roger’s “as is” ware, WReport. It would generate paragraphs of random text. Behold:
MODERN BUSINESS AND THE VAST BAKED EXCURSION
WReport output
I must acknowledge that the superabundant osteosclerotic integrity measurement and the departmentalized ratiocinative demythologization is not as energy efficient as the all-too-well-known noncommissioned management information system. The preceding should need no explanation. However, the segment of the noninheritable wogging includes a number of enhancements in the feral cerebrum filename and the delayed nonintellectual autoeroticism. Disputes arise as to why the satirized DOS quotient and the torrential meteorological eruption exposes the inherent weakness in the digitally looped anticoagulating constructionism and the monastic diode matrix exhortation. Anyway, the brawny mersmerization conviction and the glazed vessel lambrequin can be contrasted with the trans-nodal softer waveguide and the ductile co-signaling necropsy. If the preceding is not a true statement, then the effect of EFI on doubled heft and the rubberized cross-platform decalcification obviates the need for more information on the unintelligible firmware activation. We suspect that the slenderized antisepticizing sexism and the conceivable irresponsible concept confirms the romanticized accrual telephone. This is where the distilled psychophysical benchmark and the uncapitalized smoking pile of paunch takes longer than the oftentimes V.32bis snorkel. We can safely assume that the exhausting protoplasmic bit has catastrophic consequences when coupled with the remote accountability water vapor and the odorless cinematographic annulment. Without further ado, let’s explore why the area of 10BaseT fiction is, arguably, the dose of super-conductor horizon and the eloquent port report. One of the beliefs that you hear being bandied about is how the perfunctory circumstantiating magnetic diffusion is satisfying, as is the dithered elasticizable insertion and the lithe placid sophism.
My issue with the whole thing is, first, that it requires a lot of work on the input side: garbage in, garbage out. For me, that same amount of effort could go into having Dragon transcribe the audio and then simply editing the output. It took all of ten minutes for Dragon to transcribe that entire module. Second, it’s wordy as hell, like a speaker who loves using as many words as they can.
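If you want to put a rough number on "wordy," a quick sketch like the one below would do it. It just counts whitespace-separated words in the generated script and in the Dragon transcript and reports the ratio. The function names are my own invention for illustration, and the two strings are only the closing sentences from the samples above, so treat the result as a spot check, not a measurement of the whole module.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words; crude, but enough for a spot check."""
    return len(text.split())


def wordiness_ratio(generated: str, transcript: str) -> float:
    """Words in the generated script per word in the transcript."""
    base = word_count(transcript)
    if base == 0:
        raise ValueError("transcript is empty")
    return word_count(generated) / base


# Excerpts copied from the two samples above (same sentence, two versions).
ai_excerpt = (
    "Additionally, there is a button at the bottom right that you can use "
    "to collapse or expand the side panel as desired."
)
dragon_excerpt = (
    "at the bottom right is a button you can use to collapse and expand "
    "the side panel as desired."
)

ratio = wordiness_ratio(ai_excerpt, dragon_excerpt)
print(f"Generated script uses {ratio:.2f}x the words of the transcript")
```

A ratio above 1 means the generated script is using more words to say the same thing. Run it over a whole module's worth of text before drawing any conclusions.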
We’ll see. In my opinion, it’s mostly hype.