Write the Docs: Ashleigh Rentz — The technical challenges of serving docs at Google’s scale

I’m at Write the Docs today in Portland and will be post­ing notes from ses­sions through­out the day. These are all posted right after a talk fin­ishes so they’re rough around the edges.

Ashleigh started at Google in 2004 as a data cen­ter hard­ware tech­ni­cian. In 2010 she got involved with a team of tech writ­ers work­ing on API doc­u­men­ta­tion. The story she told was of how Google’s CMS came to be.

Google now has so many devel­oper prod­ucts it fills a peri­odic table. Literally. They made one.

Scaling prob­lems can show up so grad­u­ally, you barely notice them until you’re already in big trou­ble. This hap­pened for Google with their CMS. What worked in 2005 was hor­ri­bly bro­ken by 2010.

In 2005 Google had just hired Chris DiBona as the head of Open Source at Google. He started by focusing on getting Google to contribute more to Open Source projects. They created code.google.com as a place for them to share code. When they launched this it was an introductory place to put some code. They started with documentation around their 10 APIs at the time. It’s built using EZT, or EaZy Templating. It’s a simple markup language you can use to define build objects in your documentation.

Google’s code site was opti­mized for small files, about 256K, and cached things in mem­ory. This grew from Google’s issues scal­ing the hard­ware impacts of their con­sump­tion at the time. It was a time when a giga­byte of stor­age was still a lot.

In 2006 Google launched Project Hosting. In the days before GitHub this meant that they had a place to host and share open code projects.

By 2010 the builds for code.google.com started running into serious issues. New docs weren’t going live and they were hitting consistent errors. Files were taking almost 45 minutes to build. This meant that a tech writer working on a document had to give themselves a 45 minute lead time. A new project document set to launch at 2pm had to be filed at 1pm. Any typo or issue in the doc submitted meant another 45 minute delay. All of that was compounded by the fact that the whole build would fail if any new doc contained a typo. One doc with an issue caused problems with new docs across all services.

There were other fail­ures, too. Outside of writer mis­takes they hit issues with disk I/O. This caused them to push the build cron jobs back to once every 2 hours. The fun part of that was that to pull any tech­ni­cal doc­u­men­ta­tion down from the web also took 2 hours. Picture how awe­some that is when you acci­den­tally pub­lish some­thing. This 2 hour turn around time just didn’t work for how Google wanted to pub­lish tech­ni­cal content.

They faced a choice between a band-aid fix and push­ing the reset but­ton on their CMS. They decided to develop a CMS that was actu­ally meant for devel­oper doc­u­men­ta­tion. A team of peo­ple worked on this new site and the new CMS. The prod­uct of this was developers.google.com.

Google’s new developer site was built differently. Gone were the days of having to do everything manually. Since Google now had App Engine they were able to leverage this as the platform from which they could build docs. They used Django-nonrel so that they could work with the Django framework on top of the non-relational database structure of App Engine.

By mov­ing the CMS away from EZT they avoided rely­ing upon a site-wide build. Now they could build only what the writer asks for, when the writer asks for it. Syntax errors now returned in 60 sec­onds, not 60 min­utes. And, your syn­tax errors don’t affect the sys­tem, just you. One down­side to no site-wide builds is that when changes (for exam­ple, with pric­ing) hap­pen out­side the doc­u­ment tree Google has to man­u­ally rebuild the doc­u­ment to reflect the new pric­ing structure.
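To make the difference between the two build models concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `render` function, the bracket "syntax", and the document paths are stand-ins for illustration, not Google’s CMS or EZT. It only shows why one typo sinks a site-wide build while a per-document build keeps the failure with the writer who caused it.

```python
# Hypothetical sketch: names and the bracket "syntax check" are illustrative only.

def render(source: str) -> str:
    """Stand-in for a template renderer; rejects unbalanced '[' and ']'."""
    if source.count("[") != source.count("]"):
        raise ValueError("unbalanced brackets")
    return source.replace("[", "<b>").replace("]", "</b>")


def site_wide_build(docs: dict[str, str]) -> dict[str, str]:
    """Old model: render everything in one pass; one bad doc aborts the lot."""
    return {path: render(src) for path, src in docs.items()}


def build_on_request(docs: dict[str, str], path: str) -> str:
    """New model: render only the requested doc; errors stay with that doc."""
    return render(docs[path])


docs = {
    "maps/intro": "Welcome to [Maps]",
    "drive/intro": "Welcome to [Drive",  # typo: missing closing bracket
}

# The site-wide build fails for everyone because of the one broken doc.
try:
    site_wide_build(docs)
    site_wide_ok = True
except ValueError:
    site_wide_ok = False

# Building a single doc on request is unaffected by the broken neighbor.
maps_page = build_on_request(docs, "maps/intro")
```

In this sketch the writer of the Drive doc would still get an error when requesting their own page, but the Maps page builds regardless.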

In late-2011 they started the process of migrat­ing over to the new site. With 80,000 doc­u­ments that’s a slow process. The prob­lem is that it split their code doc­u­men­ta­tion across 2 sites. It was a short-term issue that would even­tu­ally be fixed. The goal was to com­plete the move by May 2012 and all went smoothly.

Write the Docs: James Socol — UX and IA at Mozilla Support, and Helping 7.2 Million More People

James started things off after lunch. He works on support.mozilla.org and the Mozilla Developer Network.

James started with a short history of SUMO, which Michael talked a bit about yesterday. Through a series of redesigns they got to the design they now have. In that process they worked through solving the problems of earlier iterations.

When they tried solv­ing this prob­lem they lacked a good bit of data. What they did have, though, showed very low “help­ful” scores on arti­cles. They also had high exits from searches, re-searches, and high bounce rates.

One of the first things they did was have someone dedicated to the web side of support. They started with a heuristic evaluation and worked with a user experience expert on improving things. One thing they discovered in this was that if people got to the right article the helpfulness scores were very high. Outside of that, though, the scores tanked. They knew they had an information architecture problem.

They set out to analyze the current information architecture of the site. The first step was to manually look through the docs. They looked at what articles they had, where they were linked from, and the taxonomy that existed. To help with this they did a card sort, a means to guide users to generate a category tree.

With the map they had from the card sort they used Treejack and lim­ited the user test­ing to just dis­play­ing the title of docs. The goal for users was then to say, “This is where I will find my answer.” With their cur­rent archi­tec­ture of the time the suc­cess rates were as low as 1%. That’s bad. With that, though, they now had data. They had some­thing they could work with and could opti­mize. What they found was inter­est­ing. Some arti­cles were miss­ing, some were badly named, and some had other issues.

Their user expe­ri­ence peo­ple had a few ideas. They pro­posed and tested a few solu­tions. This took their suc­cess rates in user tests up to highs of 92%. One task specif­i­cally went from a 1% suc­cess rate all the way up to 86%. With Treejack they were able to run all these tests by focus­ing just on the titles. It meant they could test quickly with­out hav­ing to rearrange or rewrite all of their docs.

At the end of things 10% more peo­ple were com­ing to the site and find­ing their answer. They tracked this by graph­ing the rate of “help­ful” scores on doc­u­ments. That 10% meant 7.25 mil­lion more peo­ple a year found the solution.
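As a rough check on the scale involved, the two figures above can be related with a little arithmetic. This is only a sketch under one assumption: that the 10% is measured against total yearly visitors. The talk notes don’t say exactly how the baseline was defined, so the implied total is illustrative rather than a reported number.

```python
# Figures from the notes above.
extra_people_per_year = 7_250_000
improvement_percent = 10

# Assumption: the 10% is a share of all yearly visitors.
# Integer arithmetic avoids floating-point rounding noise.
implied_yearly_visitors = extra_people_per_year * 100 // improvement_percent
```

Under that assumption the site would be serving roughly 72.5 million people a year.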

Write the Docs: Heidi Waterhouse — Search-first documentation: tags and keywords for frustrated users

Heidi wrapped up talks just before lunch. She talked about search-first doc­u­men­ta­tion and how search-first writ­ing serves users. As she put it, lots of our users are com­ing to doc­u­men­ta­tion angry. They have prob­lems they need solved and can’t find the answers.

Heidi started in the mid-90s with table of con­tents focused doc­u­ments. These are won­der­ful in that they’re orderly, lin­ear, search­able, and often indexed. But, they are also rigid, lin­ear, over-described, and leave users out of the process. They ignore the fact that a good doc­u­ment doesn’t just include every­thing in the world.

Mid-career she moved on to task-based documents. These are great in how they take into account the goals of users. While they’re more modular they can be too chunky and hard to discover. There’s no path through these documents. You can hop from one task to another but the overall picture and flow becomes difficult. Task-based documents are also rigid about the information type they require.

More recently Heidi’s seen guer­rilla doc­u­men­ta­tion appear­ing. This is largely user-created, rel­e­vant to real needs, and may sur­prise you. The down­side is that the doc­u­ments can get stale, they’re uncon­trolled, and they require leav­ing the ecosys­tem of the prod­uct. The sig­nal to noise ratio can also be hard to determine.

Heidi’s proposal is that we take the best aspects of each of these models and create a new model: search-first documentation. We’ll end up with something responsive to user needs. It will be documentation that is self-triaging and is born searchable. Ideally the terms used in this type of documentation come from your users. It’s not important what you call a feature, it’s important what users call it and how they’ll search for it. For example, “blue screen of death” appears nowhere in Microsoft’s documentation but we all know what it means.

To make this type of doc­u­men­ta­tion hap­pen you first need to gather data. Using tech sup­port, user com­mu­ni­ties, and Stack Overflow you can get all the info you need. Second, you’ll have to write the docs and keep pub­lish­ing all the time. Writing pithy docs will help you focus on respond­ing to a spe­cific ques­tion. Plenty of these ques­tions won’t be answered by a sim­ple task.

Write the Docs: Jennifer Hartnett-Henderson — Sketchnotes: Communicate Complex Ideas Quickly

Sketchnotes are a great way to com­mu­ni­cate com­plex ideas very quickly. Mike Rohde defines sketch­notes as, “rich visual notes cre­ated from a mix of hand­writ­ing, draw­ings, hand-drawn typog­ra­phy, shapes and visual ele­ments like arrows, boxes and lines.”

Jennifer’s been able to com­bine her art inter­ests with her work­ing career by per­fect­ing how she does sketchnotes.

Sketchnotes work due to dual coding. If you combine the visual with the written it increases people’s ability to remember information. It’s important, though, to not think of sketchnotes as art. They’re not art and, instead, are a means of communication. They’re just notes. Sketchnotes are about combining shapes and lines into a form that makes sense.

There are a few resources to help get up to speed with sketchnotes. Mike Rohde sells The Sketchnote Handbook. Eva-Lotta Lamm also publishes sketchnotes from conferences all over. The tools you use aren’t as important as the practice. You can use digital or paper; it’s about what works best for you.

Write the Docs: Tim Daly — Literate Programming in the Large

Tim’s talk required some pre­vi­ous knowl­edge of Donald Knuth. If you don’t know who that is Wikipedia has a good sum­mary. Tim’s back­ground is largely with the Axiom alge­bra sys­tem.

Tim talked about how back in the 1970s you pro­grammed in very small files. Nothing could be more than 4K so you ended up with these trees of tiny bits of code and relied upon build sys­tems to put it all together.

IBM’s stan­dard for doc­u­men­ta­tion requires things to be writ­ten at the 8th grade level. This is under­stand­ably quite tough when you’re doc­u­ment­ing com­plex algorithms.

Tim knows what code he wrote years ago does. He knows that if he takes it out things will break. The prob­lem is he doesn’t know why he wrote it in the first place. This was tough when he faced the task of work­ing with 1.2 mil­lion lines of uncom­mented code. The 8th-grade level doc­u­men­ta­tion didn’t really help. In the early projects he worked on they didn’t write down the “Why” of code. Turns out that’s really, really important.

Tim sought a technology that would let him capture the “Why” of code. This, essentially, is literate programming and stems from Donald Knuth, the writer of TeX, METAFONT, and many more pieces of code. A literate program should pass the Hawaii Test. This is where you take the program, print it in book form, give it to a programmer, and send that person to Hawaii for a couple of weeks. When they’re back they should be able to work on and modify the original code as well as the original programmer. If you have that, you have a literate program.

The book form of a lit­er­ate pro­gram includes all the nec­es­sary source code to build a sys­tem along with all the doc­u­men­ta­tion and nar­ra­tion required to under­stand that system.
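To give a feel for how one document can hold both the story and the buildable source, here is a minimal sketch of “tangling”: pulling the code chunks out of a narrative so they can be run. The `<<code>>` and `<<end>>` markers and the `tangle` function are made up for this example; they are not the syntax used by Knuth’s WEB, noweb, or Axiom.

```python
import re

# A tiny "book": prose explains why, chunks hold the code.
# The <<code>> / <<end>> markers are a hypothetical format for this sketch.
BOOK = """\
We want a function that doubles a number, because later chapters
scale every measurement by two.

<<code>>
def double(x):
    return 2 * x
<<end>>

The next chunk uses it, which shows why the function exists.

<<code>>
result = double(21)
<<end>>
"""


def tangle(book: str) -> str:
    """Return only the code chunks, in document order, as one program."""
    chunks = re.findall(
        r"^<<code>>\n(.*?)^<<end>>$",
        book,
        flags=re.DOTALL | re.MULTILINE,
    )
    return "".join(chunks)


# Extract the program from the book and run it in its own namespace.
program = tangle(BOOK)
namespace = {}
exec(program, namespace)
```

The same source could also be passed to a formatter to produce the printed book, which is the other half of the idea and is left out of this sketch.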

Tim argued that pro­gram­ming teams need an Editor in Chief. No one should be able to check in code with­out this EIC affirm­ing that the code has an expla­na­tion along with it. The EIC gets between devel­op­ers and the repos­i­tory and says, “We’re writ­ing a book about this code. You can’t check in code with­out the code and the story about what the code does match­ing.” When you have the expla­na­tion along with code you can com­pare a programmer’s stated goal with the real­ity of what the code does.

Companies depend upon pieces of software that former employees created. If you don’t understand that code you end up rewriting it. By ensuring our programs are literate programs we provide for stronger, future-proof code.

Write the Docs: James Tauber — Versioned Literate Programming for Tutorials

James started things off after the break talking about a combination of ideas around literate programming and version control. He pitched it as a Socratic talk that would pose more questions than answers.

James comes from a back­ground of more than 20 years involve­ment in Open Source projects. He’s the CEO and founder of Eldarion which builds web­sites in Python and Django.

In June 2003 James posted to the Python mailing list about how feature structures could be implemented in Python. He worked up an example that was somewhat like narrative programming, a method in which you explain to humans what the code is doing while you are writing the code.

Much of the talk went over my head so these notes aren’t the great­est. The gist seems to be that lit­er­ate pro­gram­ming is not a means of redo­ing how we do doc­u­men­ta­tion but, rather, a way we rethink pro­gram­ming. Writing the code and describ­ing the code ought to be part of the same process.

Write the Docs: Daniya Kamran — Translating Science into Poetry

Daniya started by stating that she is not strictly a documentarian. She’s a translator and interventionist. She turns crazy black hole-style concepts into simplistic solutions. She focuses on questions like, “How do we turn around how people view nutrition?”

Documentation is an inter­ven­tion. You’re cre­at­ing an inter­ven­tion from the per­spec­tive of the user. It changes the way they relate to a prod­uct and their train of thought when using the product.

The tech­niques and processes she pre­sented are not a check­list. They’re strate­gies you can employ selec­tively to reach your goal. Her goal is to give you a poetic lens for your work. If you don’t like poetry, sci­en­tific doc­u­ments, or want her to just show you how to write the docs then you’re not going to like this.

Poetry is very good at immor­tal­ity. When you write you can­not write think­ing that what you’re writ­ing is never going to be seen again. If we assume doc­u­ments are tem­po­rary it will come across in our writ­ing. The user can tell. Poets write to tran­scend time. As pro­gram­mers and as doc­u­men­tar­i­ans we don’t do this. Even if no one under­stands your work they should at least know that you wrote it. Write think­ing that you’re a bad ass. Here’s Daniya’s guide­line: Good writ­ing will get replaced. Bad writ­ing will get replaced imme­di­ately. Epic writ­ing will be edited. Be epic.

Poetry is always about a dilemma. There’s something about to happen that you can’t quite figure out and need to solve. We assume that people know why they are reading the docs. There needs to be a context for what is being created. Docs, ultimately, answer the question, “What do I do?” In many, many docs this is not obvious. Your worst enemy is not a competing document but a lack of initiative. Someone is going to read what you wrote and nothing will happen. In making dissonance obvious we create a sense of urgency, increase user autonomy, and provide a call to action.

Daniya says we should be biased. There is such a thing as a point of view. Scientists are very aller­gic to a point of view; they don’t like it at all. We asso­ciate objec­tiv­ity with intel­li­gence, even though that’s not always the case. The per­son read­ing your docs is com­ing to you because you’re the expert. They don’t want to do the think­ing that you did. They want the answer. There’s a rea­son opin­ions flow and have an impact. A lot of it just has to do with assum­ing you are the expert and writ­ing accord­ingly. Influence your read­ers. The goal is to be holis­tic. Encase your opin­ion in objec­tive analy­ses as well as other opin­ions. Having the point of view pro­vides the reader con­text. They can under­stand that you’re human.

Another thing scientists are allergic to is error. Poetry deals with it in a fascinating way. In some ways poetry is all about error; bad things happen all the time. As Daniya said, “Why is epic poetry still epic when everything is going wrong?” Poetry deals with things as part of a cyclical process. When you view things as part of a process it removes a large aspect of the negativity. Daniya also phrased it as, “a lack of error is very contrary to human ability.” An error is not a consequence, it is an amendment to the process and a part of the process. It’s temporary and provisional and will, eventually, be edited and improved.

Poetry is very, very good at reit­er­a­tion. What reit­er­a­tion allows is for us to remem­ber the pur­pose after every major turn­ing point and com­plex­ity. People should not for­get why they are read­ing what they are read­ing. Reiteration allows them to con­nect the cur­rent com­plex­ity back to the orig­i­nal pur­pose. Periodically bring­ing back con­text allows a reader to never lose sight of what you’re try­ing to do.

Metaphors are, in some ways, the most impor­tant point. All the evi­dence you’ll be draw­ing from to write your docs already exists. We merely rearrange bits of infor­ma­tion in new and inter­est­ing ways. All we are doing is cre­at­ing metaphors. They allow us to rearrange the pat­terns of our mind. We’re mak­ing remote asso­ci­a­tions between things we didn’t think were related at all. The reader fills in these pat­terns and asso­ci­a­tions and, thus, con­nects more deeply to your writ­ing. Instead of push­ing some­thing out of the page the reader is pulling it out of themselves.

As documentarians we are adding to how people view communication. The way you ensure that your documentation is eternal is to make sure as many people as possible can read it, leverage it, and connect with it. The bottom line in all this is elegance. Can you make your documents elegant? If you can capture this in your words and on your page then you have done everything you can.

Write the Docs: Noirin Plunkett — Text Lacks Empathy

Noirin opened the sec­ond day at Write the Docs by stat­ing that we are basi­cally hair­less mon­keys. We’re inher­ently emo­tional people.

Ein + füh­lung: The German root of empa­thy. Our abil­ity to com­mu­ni­cate, and to do so with empa­thy, is what helps us cre­ate these social con­nec­tions. Facial expres­sions, body lan­guage, and more help us cue these reac­tions and connections.

Text, though, can remove emo­tion from our com­mu­ni­ca­tion. We lack the facial expres­sions and more sub­tle indi­ca­tors that help us in per­son. We have a ten­dency to fill emo­tional voids with neg­a­tive emo­tions. This is par­tic­u­larly true in high stress situations.

The rapid­ity with which we can com­pose dig­i­tal text is not ideal if we’re try­ing to solve com­plex prob­lems. What works for in-person con­ver­sa­tion does not work as well in a text format.

A lot of the time we don’t write. In email or in doc­u­ments we aren’t invested in we, essen­tially, speak with our fin­gers. For peo­ple we have a con­nec­tion with, that’s fine. When writ­ing doc­u­men­ta­tion we don’t have that same rela­tion­ship with our audi­ence. We don’t know their back­ground, we don’t know why they came to the doc­u­ment, and we don’t always remem­ber that com­mu­ni­ca­tion is more than just transmission.

Learning social rules is an ongoing process. It’s exhausting and difficult for many. Noirin refers to it as running in emulation. It’s like booting up a virtual machine to try and understand how something works in a different context.

Oblique Strategies from Brian Eno was men­tioned. It’s a way to help with cre­ative prob­lem solv­ing. So when you’re stuck on a prob­lem you can draw a card and apply that to the sit­u­a­tion you’re fac­ing. They’re not so much advice as a means to remind you how to think about problems.

Noirin dis­cussed a few strate­gies for mak­ing our docs more emo­tion­ally engag­ing. First, we have to under­stand expec­ta­tions. This applies to many aspects of our com­mu­ni­ca­tion. The expec­ta­tions our users have when read­ing doc­u­men­ta­tion, when a boss reads our email, and more are impor­tant to how our text is received.

Most peo­ple assume their incom­ing com­mu­ni­ca­tion has tact attached to it. We don’t assume com­mu­ni­ca­tion is rude and abra­sive. When it is, it sur­prises us. To solve this Noirin rec­om­mends we all attach a lit­tle tact to our output.

The next strategy Noirin covered is to argue that zero is not negative. We should try to recognize when we’re projecting negative emotions into a space that has no emotion; at that point we’re assuming. If it’s unclear what the emotional context is, ask. That’s the only way you can be clear about the intent of a message.

If you trans­mit­ted a mes­sage and a dif­fer­ent mes­sage is received the onus is on you. You have to make sure your audi­ence under­stands what they’re read­ing. Communication is a two way medium and if some­thing is mis­un­der­stood it’s not entirely the reader’s fault. The reader is the only thing that mat­ters with doc­u­men­ta­tion. When in doubt we should rephrase some­thing. If you have to ask whether a sen­tence is gram­mat­i­cally cor­rect, it doesn’t mat­ter. Rewrite it.

The read­ers of your doc­u­men­ta­tion don’t know how you feel. Our read­ers can’t see us, they can’t hear us, they don’t know if we’re hav­ing a good day or a bad day. Stating our emo­tions is a good way to get con­ver­sa­tions back on track. If a con­ver­sa­tion over text isn’t going well, state your emotions.

Noirin rec­om­mends mov­ing through com­mu­ni­ca­tion flow like this: email, IM or IRC, voice, video, real life. Those are in increas­ing order of fidelity. If email doesn’t work, move to IM. If that doesn’t work, move to voice. As she put it, “the fastest way to pass a Turing test is to pick up the phone.”

Perception is real­ity. If some­one feels attacked, for exam­ple, they will shut down. That inher­ently makes their feel­ing real­ity. Reality is not what you’re try­ing to com­mu­ni­cate, it’s what they’re feeling.

Noirin’s last point is that if it doesn’t mat­ter, do it their way. Don’t be a stub­born fool just because you want it done your way.

Write the Docs: Teresa Talbot — Technically Communicating Internationally

Teresa con­tin­ued after­noon ses­sions by talk­ing about the why and how of work­ing abroad. She’s been a tech­ni­cal writer for about 20 years and spent 7 of those years work­ing out­side of the United States.

There’s a strong demand for tech­ni­cal writ­ers out­side of the US. This is largely because English is the most-spoken sec­ond lan­guage in the world. Lots of tech com­pa­nies abroad wanted English-speaking tech­ni­cal com­mu­ni­ca­tors. Teresa has even worked for com­pa­nies in the UK because they sought a US-specific tech­ni­cal translator.

The first route to working abroad is to have a company sponsor. This is what allowed Teresa to work and live in Holland. While this gives you certain benefits like state-run healthcare and whatnot it also more directly subjects you to the more unique aspects of that country’s tax and employment laws.

Another route is to work as a con­tract­ing American for an inter­na­tional com­pany. Teresa did this for a trans­la­tion com­pany work­ing in Japan. Since Teresa was billing from a US social secu­rity num­ber she didn’t need a work per­mit which made things more convenient.

You can also start a com­pany abroad. Teresa did this in Bulgaria and while she had a busi­ness license she never did get a res­i­dency permit.

Overall Teresa’s talk dove into lots of the nitty gritty in working abroad. Not the best content for notes but I noted what I could. :)