Varytale Analytics

So Bee has been out for a couple of days now in reader beta, and my analytics page overflows.

The Varytale system includes a mechanism by which readers can rate and comment on any individual piece of the story as they go along, giving a one to five star ranking and displaying the average of those ranks as the book’s quality score. The commenting part is well-hidden and of course requires more effort on the part of the reader — I’ve gotten only a handful of comments, mostly to inform me of localized typos or bugs — but the ratings part is almost intrusively prominent.

As a reader, I’m not sure how I feel about being asked to grade what I just read every few paragraphs, so I haven’t actually had the nerve to grade anyone else’s Varytale books; and for that matter, just being asked “hey, how did you like that?” so often keeps drawing my attention back to an evaluative process when I might prefer simply to experience the story for the time being.

So I have mixed feelings about it as a reader, but it’s the authorial perspective I mostly want to talk about here.

As an author, you get a big chart that ranks all of your storylets by their average rating. The analytics also break down how many people chose each of the several paths through a branching storylet, in what order, and so on. You can see exactly how many people read each storylet, and on which dates. There’s no way to tell for sure when a given reader stopped reading your book, because in theory they could just not have finished it yet, but lots and lots of other metrics are visible.

This is the first time I’ve been able to collect that kind or level of feedback on any of my work, and I am morbidly fascinated. I’m going to show the top and bottom ends of the chart for Bee, which will necessarily be just a tiny bit spoilery for the names of sections in the story.

Here are the top-ranked storylets in my piece (as of when I started writing this blog post):

Because the storylets with a 5.0 ranking go all the way to the top, we find that the stand-outs are the ones that not that many people have ranked. They’re rare and/or late in the story.

Those just below them with high 4.7-4.9 ratings — the ones given a pretty good or great rating by lots of people — tend to be storylets that a) are relatively plotty and b) come as the culmination of some extended development. Typically, they are things that the reader will only encounter once, too, in contrast with the “drillwork” storylets that can be repeated several times. Most of those pieces, as well as most of the stories that come in the middle of a developing plot arc, get more middling scores.

On consideration, this is not really that surprising. I mean: the parts of books I like best, and find most memorable, are often culmination scenes of some sort or other. The build-up isn’t as inherently punchy and memorable, but that’s because it has a different job to do! That doesn’t mean it doesn’t belong in the story.

On the other hand, this feedback might also mean that people aren’t crazy about the book’s structural decision to incorporate very short vignettes about the protagonist’s chores and training drills, which can be read (though often with different variations) multiple times in the course of the story. Perhaps some readers would prefer that the content all be entirely big-plot stuff.

(A side note: when Varytale’s action meter was still functioning, meaning that you could only read small chunks at a time, I had Bee set up so that most of the big/plotty pieces were two credits and the repeatable ones were one credit — not because I wanted to gouge people for credits but because I wanted to communicate clearly where the big pieces of story were to be found. On the whole I’m entirely happy to have the action meter no longer working, but that means that the structure doesn’t necessarily communicate to readers any more whether a given piece is going to be significantly story-advancing, and they have only the storylet titles to work from.)

Now here’s the bottom of the chart:

…and this demonstrates something that makes the rating system a bit challenging as a way to grade books’ overall quality.

Four of the five bottom-ranked storylets are the first four storylets you read when you start up the book. It seems like a reasonable bet that people who rated them 1 star weren’t grabbed enough by the story to continue, so their input is filtered out from the later stages.

In one way, it’s useful that relatively little of the book’s feedback comes from people who thought the opening sucked. As an author, I’m certainly more interested in the detailed feedback of people who are in my audience. If someone doesn’t like my subject matter and writing style, having that person go through and negatively grade every single one of the storylets wouldn’t help me hone my craft or improve the reading experience for people who do like the kind of thing I want to write.

However, this also means that a book’s average rating on Varytale is likely to include many — perhaps even dozens — of ratings apiece from people who liked the book enough to play a lot of it, and only a couple negative ratings from people who hated it enough to stop reading quickly. This distinguishes Varytale’s rating system from that of, say, IFDB. On IFDB, you might get one star from someone who hated your game and 5 stars from someone who loved it, for an average of 3. On Varytale, unless I drastically misunderstand the system, you can get ratings of {1,1,1} from someone who then gave up in disgust, and ratings of {5,4,4,4,5,5,5,4,4,5,3,5} from someone who played all the way through, for an average of 3.7.
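To make the arithmetic concrete, here is a minimal sketch in Python using the two hypothetical readers from the paragraph above. The function names and the "per-reader" weighting are my own illustration, not anything Varytale or IFDB is known to compute; the pooled figure simply follows my understanding of the system as described.

```python
from statistics import mean

# Hypothetical ratings, copied from the example in the text above.
ratings_by_reader = {
    "gave_up_early": [1, 1, 1],
    "read_to_the_end": [5, 4, 4, 4, 5, 5, 5, 4, 4, 5, 3, 5],
}

def pooled_average(ratings_by_reader):
    """Average over every individual storylet rating, regardless of who gave it."""
    all_ratings = [r for ratings in ratings_by_reader.values() for r in ratings]
    return mean(all_ratings)

def per_reader_average(ratings_by_reader):
    """Average of each reader's own mean, so that every reader counts once."""
    return mean(mean(ratings) for ratings in ratings_by_reader.values())

print(round(pooled_average(ratings_by_reader), 1))      # 3.7
print(round(per_reader_average(ratings_by_reader), 1))  # 2.7
```

The pooled number lets the reader who stayed contribute twelve ratings to the other reader's three, which is why it lands so much higher than a figure in which each reader counts once.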

Okay, so lots of caveats. But this feedback has already been pretty useful to me in a couple of ways. First, if I mentally group storylets by type — big plot piece or little textural piece? beginning, middle, or conclusion? one-off or repeater? easy to reach or rare? — then I can pull out more interesting information from how a storylet performs relative to others of its class.

For instance, in the bottom-ranked picture, there’s one storylet, “Fashion Comment,” that did unexpectedly poorly. It’s a one-time-only story point, appearing partway into the narrative, and it’s about a character whose stories otherwise are graded much higher.

I went back to have a look at it and realized that I could see a problem that hadn’t occurred to me. It didn’t seem conspicuously worse written than similar pieces. But it did paint that character in a somewhat more unfriendly, mean-spirited light than his other appearances. It wasn’t extreme enough that it felt implausible to me-as-author, but maybe it was jarring to readers who had built up a more sympathetic concept of that character. I felt like I had the choice of either deepening that moment to justify it and incorporate it better into his personality, or dropping it entirely. On consideration, I chose to drop it. I don’t know for certain whether my analysis is the same one my readers made — this method is a lot less articulate in some ways than having a skilled editor or beta reader tell you exactly why certain things don’t belong in the story. But the analytics do give you feedback about when you might want to give something another look.

There are other patterns emerging too, which I may try to address as I have time.

Even though I can explain why the opening segments have a lower average than everything else, for instance, that fact encourages me to see whether I want to take another pass at them. And though I think the drill and task-work elements of the story are an important part of its texture, I can see some indications that some of these work better, perhaps feel less repetitive, than others, and I take that to heart.

The other thing that makes this work is that it’s extremely easy to push new content live on Varytale without disrupting anyone’s experience. Everyone’s bookmarks still function; no one has to throw out a save file or redownload anything. They just happen to get the upgraded content the next time they click on that spot.

Overall, then, the analytics tools take a bit of thoughtful parsing to understand — it’s not quite as simple, necessarily, as “the good bits of your book are the high scoring ones, so make more like those.” But they are really useful all the same, and none of the tools I’ve typically used for branching or interactive stories have offered anything quite like them.

On the other hand, the ratings system may have some serious limitations when it comes to helping new readers decide what they might like to try. Stripped of the storylet-by-storylet granularity, a book having an average between 4 and 5 probably just means “people who liked this book enough to read through it liked it pretty well” — not so very informative.

Finally, I should add that Ian Millington, the chief technical force behind Varytale, is constantly refining the tools and experience, so I wouldn’t be surprised if these features continue to evolve over the coming months. Everything I’ve said here is just a reflection of what I’m seeing right now, not of how it always will be.

Still, it’s cool stuff. More about the actual process of writing later.

27 thoughts on “Varytale Analytics”

  1. I definitely found the rating part excessive. I think I only rated 2 or 3 times (I played to an end). I think I was more likely to rate something that stood out positively, for what it’s worth.

    I didn’t notice a way to make textual comments, by the way. I was a bit disappointed by how things came out, but not really interested enough to play through the whole thing again to see if I could change the outcome.

    • Hm, okay. I don’t know whether this would have been useful to you, but: the system lets you create extra bookmarks in the story. If you click the little bookmark ribbon on the upper left, you get a sort of bookmark-manager thing that lets you duplicate your current bookmark — essentially the same thing as making a save file.

  2. It wouldn’t allow the sort of metrics-gathering that seems largely the point here, but I think I’d actually find a full text box for comments on each storylet more inviting / less intrusive than the 5-star rating. Maybe that could be a sort of premium feature? I like the idea of an author having such a direct line to their readers.

  3. I find myself really unwilling to score the text during normal play, too. If I’m going to take a second pass in analysis mode, maybe. But the lack of distinction between ‘you have grammatical errors’, ‘you contradicted earlier continuity’, ‘this section is less interesting than your other sections’, and ‘I disapprove of what you’re doing with your characters’ would still drive me nuts. And my instinct is to rate sections relative to the general quality of the book, but it seems that what you’re doing is in fact contributing to the book’s rating relative to other books.

    Of course, these problems are to some extent true of any ratings system, but there’s something about the scale of this one that I don’t dig, or don’t get.

  4. Oddly, I don’t think I got *any* of the five high-ranking storylets in the top chart — I can’t be sure, but none of the names are familiar, whereas I do recognise the lower-ranking ones. Obviously I need to play it again.

    I do also wonder how much players will tend to rank things depending on how much they *enjoy* a story, vs how good they thought it was; I’m particularly thinking of the sequence where the protagonist persuades Mom to go back to the salon for a haircut and it’s horribly expensive. To me, that was decidedly uncomfortable reading (although it was pretty heavily signposted), even though it was an important piece of character development. How did that scene rate?

  5. As one UX data point, I thought at first that I was _required_ to rate each passage before continuing. The prompt “Before finishing, please comment or rate this” led me in this direction… I thought that the passage would not count as completed until I did.

  6. I didn’t rate as I played because, as some others have said, the inability to explain or rationalise a rating was offputting – I didn’t see the comment option. It also felt kind of demanding; I just wanted to enjoy the story. (Which I did).

    I’m really interested in hearing about the writing process, and the interface for creating the piece. Do you know if Varytale is something that will be open for contributions from the cloud in the future?

    • I’m really interested in hearing about the writing process, and the interface for creating the piece.

      I’m working on a post about that; it will probably be a few days before it’s ready.

      Do you know if Varytale is something that will be open for contributions from the cloud in the future?

      I believe the intention is to gradually open up the tools — and as you can see on the page, there is a section for “user submitted content.”

  7. Emily, this is a fantastic story. It speaks to me on a lot of levels. Thank you.

    As for the Varytale format, I found that at times when I scored one branch of a storylet, but then explored the other branch (such as Latin or Greek word roots storylets – btw, the comment that everything comes down to a vocabulary problem is simply brilliant, but I digress) it kept the same score I offered when reading the alternate wording. It seems odd that each sub-branch wouldn’t get its own unique score. That said, about 2/3rds of the way through, I grew weary of scoring and simply clicked through the story.

    It’s possible that (many?) readers may appreciate the “choose your own”/”choice of” aspect of Varytale, but not participate effectively in the scoring aspect. In that case, I might not rewrite a storylet based on positive or negative scores, until I first considered the overall volume of click-through vs # of scores to weigh whether a silent majority’s silence implies I should leave the storylet alone.

    I don’t want the scoring to feel compulsory at all. If it distracts, skip it. If you’re a freeform rather than a stars person, all good. I know from my book it’s great to get feedback at all. But I’d rather readers enjoyed the reading.

    So there’s a difficult balancing act. Some folks feel like the feedback tools are in their face and feel compulsory. Others are making feature requests that we add a way to comment. So I wonder how best to strike that balance. Suggestions very welcome!

    I agree about wise use of data, Emily. I would say the low ratings at the start of a book (which is true of most but not all the books on the site at the moment) also might be an indicator about how to ease readers into the story. I don’t think anyone has given good thought to really writing with this much data.

    To others who are interested, I’d also say that ratings aren’t the only data. I’ve found the timing information interesting. We can see how long it takes folks to read each section of each story. I’ve noticed in How to Read that some of the sections tend to be skipped. I wonder if they’re too obvious?

    As for writer’s credentials, I’m adding 10 people a day, email me at to join the queue.

    • I would say the low ratings at the start of a book (which is true of most but not all the books on the site at the moment) also might be an indicator about how to ease readers into the story

      Yes, this is true too. I do take this as an indication I should take another pass at the opening, but I wanted to do a little bit of an experiment and see whether I was on the right track about how before I committed across the board.

      My hypothesis was that these opening segments were too much written from a game-tutorial mentality that I had to introduce important elements of play, and too little from a fictional perspective; and that people would prefer them to be a bit longer and more show-not-tell — closer in form to the dramatic storylets that occur later in the book. (The way they were originally written was meant to be sort of quick to breeze through before settling into the meat of things, and not too irritating to replay if people had already been through the book once… but possibly that’s the wrong thing to be worried about, since if people don’t get into it the first time, they’re certainly not coming back.)

      So I did some tweaks on the Lettice opener: nothing drastic, but just enough to make it less perfunctory and give longer, more personal text in a few places. Since I did that, it’s been steadily pulling ahead of the other opening passages in the stats: way fewer 1-star ratings, a couple of positive comments. So that seems like the right way to go, and I mean to revise the other pieces in the same light when I have a chance.

      • For what it’s worth, I think this hypothesis describes my experience fairly well. I gave up quickly on this story, for two reasons:

        1. Although I could see I was picking the order in which the early storylets were displayed, and modifying little bits of flavor text that appeared at the end of the early paragraphs, I otherwise saw no indication that my choices were meaningful in any satisfying way. The feedback I got was not very juicy, at least at that early stage.

        2. The writing was more workmanlike than I expected. I was very surprised, as I normally find your prose immediately evocative and engaging. It hadn’t occurred to me that I was playing through a tutorial and that the meat of the game might be written (and/or structured!) differently.

        I do think a tutorial is a great game design choice, but because it wasn’t signposted, I felt that my good faith was expended rapidly. The experience of the early game wasn’t one I particularly wanted to continue, and I had no idea it wasn’t representative. Now, I don’t know if that’s a solvable problem, without too much breaking of the fourth wall or screwing with overall tone too much. It may be I’m just not the target audience for this format. But if the feedback is helpful, there it is. Thanks, as always, for all of your hard work to advance interactive narrative media.

      • That is useful feedback — thanks!

        And yeah, the first four storylets are essentially there to give you information you won’t be able to work without during the later story, so you have to go through all of them. After that, there’s a lot more variety and agency, it’s possible to skip storylets you’re not interested in, and your decisions will quickly start to determine what subplots and options are presented to you.

        But that doesn’t mean I couldn’t make the opening sections juicier.

  9. Enjoying this story – am also beta-testing Varytale having spent a bit of time learning the basics of Inform 7. I was wondering if I could ask you more of a ‘theological’ question as someone who’s been at the forefront of parser-based stories for years – are you now sold on choice-based narrative? Or do you miss the ‘infinite possibilities’ of the parser?

    Were there times when you wished that you could create more open-ended/puzzlier situations or was it a relief not having to do so much coding of obscure actions etc?

    I imagine that this answer might form part of your next post but just wanted to cue up the curiosity!

    • I can jump ahead on this one, I suppose — different mechanics serve different types of story, and I don’t feel like I have to give up on one in order to take another for a spin. Despite concerns about the accessibility of parser-based IF, I do still have a pretty large WIP in that form. But it’s nice to try out other things, too, and Varytale’s structure was flexible enough to let me do what I wanted to do with Bee. It would not have been enough for certain other stories. That’s fine.

      • Cool – I thought that you might say that. It’s exciting to know that you’ve got another parser-based game in the works. Do you think that creating a story in Varytale would be a good prelude to creating something in Inform 7? Or are the two platforms too different in the way they work for there to be much crossover?

      • Prelude in what sense? Doing some interactive writing of one type will probably inform how you approach it with another tool, and in that way it might not be a bad thing. But I wouldn’t try to use Varytale to prototype an IF piece (or vice versa) because they really have different rhythms, focus on different styles of interacting, and so on.

  10. That makes sense – I meant more in terms of the skill set than as prototyping of one particular plotline. The way I picture it is that Varytale would be much less challenging in terms of the coding as long as you could keep the branching and combinations of qualities under control. This would leave more onus on the writing which would be my preference for now… I love playing parser-based IF but get the sense that the debugging is a substantial chunk of the work for new authors…

  13. So here’s how clueless I am. The text next to the stars (which seems to have changed over time — I played in April 2013) says “Please consider commenting or rating this,” but I took “this” to mean the ENTIRE WORK, not the little bit of it I’d just experienced. So I didn’t rate anything because I thought, “Well, I’m going to review it anyway.” I actually quite like the notion of being able to “like” a more granular piece of story, but I had no idea that’s what those stars were for.

    For whatever it’s worth, here are some bits I *would* have given five stars to had I known I could (rot13’d for spoilers):

    1) Gur fprar jurer fur ehaf njnl gb Fnen, cnegvphyneyl “Lrnu, gung fbhaqf evtug.”

    2) “Vs lbh jrer abg xvyyrq, lbh jbhyq or pbzcyrgryl fnaqrq qbja, cbyvfurq, cresrpgrq.”

    3) Gur ragver fnyba frdhrapr, sebz gur puncgre jurer vg unccraf gb gur fhocybg bs tbvat onpx.

    4) Gur snoyr nobhg gur jbzna jub unq orra fuhg va nyy ure yvsr nfxvat sbe gur qvpgvbanel.

    5) Ure qrfpevcgvba bs gur Abegu Xbern qbphzragnel. Irel jryy pubfra.

