Tae
Oct 24, 2010

Hello? Can you hear me? ...Perhaps if I shout? AAAAAAAAAH!
How expensive are these mansions, jeez

golden bubble
Jun 3, 2011

yospos

Towa and Roberu manage a near perfect EN/ID and JP split in viewership. Nene and Botan have some special fans. Nene is the only JP Hololive member with a noticeable Spanish following. Botan is the only one with a significant Russian following.

Revolver Bunker
May 12, 2004

「この一撃にかけるっ!」

Tae posted:

How expensive are these mansions, jeez

Depends on location. I have relatives in Hong Kong that live in apartments that are essentially an entire floor of a building and run in the hundreds of thousands of dollars. If it's in Tokyo and in any of the pricey areas, like the Roppongi Hills/Azabu-Juban area, then I would guess similar pricing to own. It would be close to the cost of buying a condo in the USA.

astr0man
Feb 21, 2007

hollyeo deuroga

golden bubble posted:

Towa and Roberu manage a near perfect EN/ID and JP split in viewership. Nene and Botan have some special fans. Nene is the only JP Hololive member with a noticeable Spanish following. Botan is the only one with a significant Russian following.

Is this just based on what % of their chat messages are in EN vs JP, or was it actually looking at whatever info's available on people's yt/google account profiles?

Amp
Sep 10, 2010

:11tea::bubblewoop::agesilaus::megaman::yoshi::squawk::supaburn::iit::spooky::axe::honked::shroom::smugdog::sg::pkmnwhy::parrot::screamy::tubular::corsair::sanix::yeeclaw::hayter::flip::redflag:

astr0man posted:

Is this just based on what % of their chat messages are in EN vs JP, or was it actually looking at whatever info's available on people's yt/google account profiles?


The person that did it wrote up a giant post on the Hololive reddit about it:

https://www.reddit.com/r/Hololive/comments/l2f9en/which_member_gets_the_most_english_chat_messages/ posted:

TL;DR

No collabs, no “English only!” challenges, and no “English study/talk” streams included

No messages consisting solely of emojis, punctuation, numbers, or ‘w’ spam counted

EN / ID is anything that uses only A-Z, ES is anything that uses Latin characters but goes beyond simple A-Z (eg diacritics), RU is anything that uses Cyrillic, JP is everything else

Dataset is, in general, around the most recent ~10 streams of each member’s, with more added if needed to hit 15 hour and 50,000 message minimums (minimums not applicable to Holostars)

Graphs round to 1 decimal place and don’t show percents below 1%, so stuff doesn’t always add up to 100%

I made specific notes about Miko, Haachama, Pekora, Coco, and Towa below. Please read those first if you have a question, concern, or particular interest about any of those members’ results.

I will not be doing HoloID or HoloEN as their charts will just be a bunch of 99% or 100% EN / ID

Introduction

I’ve always been curious about the language breakdown of Holo members’ chats – who gets the most English messages, who gets the fewest, what percent of their chat is English, just how many Russian messages does Botan get, etc. – so I thought it would be a fun project to analyze the data and try to answer these questions. For this, I wrote a program that reads each of the chat messages on a stream, determines what language it is, and collates all the data, and then I graphed that data. As the images say, all in all I ended up analyzing almost 3 million chat messages, and these are the results.

Data Collection Methodology

I first had to determine exactly where to get the messages to analyze. My goal for this project was to get the language breakdown of the average stream for each member. I didn’t want the data to be skewed by content such as unique, one-off streams, especially ones that had a specific language-focus to them. To that end, I established 2 rules for determining which streams to analyze – (1) no collabs, as collabs run the risk of the other collab member’s audience too heavily influencing the chat of the streamer I was observing, and (2) no language-focused streams, in other words, no “English only!” challenges, no “English study” streams, etc. Note that rule (2) had a very minimal effect and only ended up excluding 2 Sora streams, 1 Shien stream, and 1-2 Coco streams (see below for more about Coco).

Next, I had to determine how to parse each message. The first step was a bit of preprocessing – if a message was solely numerical, an emoji, punctuation marks, only ‘w’s, or any combination of these, I discarded the message entirely and did not count it towards any individual language or towards the total number of messages, as such a message could not accurately be assigned to any individual language. Next, I had to place each message into the corresponding language bucket. In the image, I referred to the four buckets as EN / ID, JP, ES, and RU, but that isn’t 100% accurate due to the parsing algorithm I used. Here is the full definition of each bucket:

EN / ID – Any message that only uses Latin characters found in the English alphabet (A-Z). This primarily captures English and Indonesian (as both only use the 26 standard English letters), but it also can end up mistakenly capturing non-English messages from other Latin-based languages if those messages happened to not use any special letters. This may occur either because the writer was too lazy to properly write diacritics or if that particular message just happened to not contain any. The overall effect of this is that the EN / ID is very slightly over-counted, however the number of people writing unaccented Spanish, French, Italian, etc. messages in Holo members’ chats is extremely low, so the very large sample size should mostly eliminate any real bias this would cause.

ES – Any message written using Latin characters where at least 1 character is a non-English letter. This covers everything from diacritics like Spanish é and German ä to entirely new letters like Scandinavian Ø. While this bucket technically encompasses many different languages, for Holo purposes it’s mostly Spanish (and perhaps Portuguese) messages, so I have merely called the bucket “ES” for convenience.

RU – Any message written using Cyrillic characters. While there are technically many languages besides Russian that use the Cyrillic alphabet, I think it’s safe to say that the vast majority of any Cyrillic messages are going to be in Russian, so I think it’s fair to call this bucket “RU.”

JP – Any message that was not outright excluded in preprocessing and does not fall into one of the above 3 buckets. Due to the extremely large number of characters in the Japanese language, I decided to go with an exclusionary approach to determining if something was a Japanese message. This means that technically any messages not written using either Latin or Cyrillic characters get counted as JP messages. So, for example, messages in Arabic, Chinese, or Korean would end up getting counted in the JP bucket. Similar to the EN / ID bucket, due to the extremely low number of messages in those languages compared to the huge sample size of messages, the effects of this should not really be noticeable.
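
The preprocessing and four buckets described above can be sketched roughly as follows. This is not the original poster's program, which was never published in the thread; it is a minimal illustration that assumes Unicode character names and categories are a good enough proxy for "Latin", "Cyrillic", and "emoji or punctuation". The names `is_countable`, `classify`, and `W_SPAM` are hypothetical, and the treatment of fullwidth 'ｗ' and the precedence for mixed-script messages are assumptions the post does not spell out.

```python
import unicodedata
from typing import Optional

# Assumption: both ASCII and fullwidth 'w' count as laugh spam.
W_SPAM = set("wWｗＷ")


def is_countable(message: str) -> bool:
    """Return False if the message is only numbers, punctuation, emoji, or 'w' spam."""
    for ch in message:
        if ch.isspace() or ch in W_SPAM:
            continue
        category = unicodedata.category(ch)
        # N* = numbers, P* = punctuation, S* = symbols (emoji are mostly So),
        # M* = combining marks, Cf = format characters such as the emoji joiner.
        if category[0] in "NPSM" or category == "Cf":
            continue
        return True  # found a character that carries language information
    return False


def classify(message: str) -> Optional[str]:
    """Return 'EN/ID', 'ES', 'RU', 'JP', or None for a discarded message."""
    # Compose accents so that 'e' + combining acute is seen as a single 'é'.
    message = unicodedata.normalize("NFC", message)
    if not is_countable(message):
        return None
    saw_extended_latin = saw_cyrillic = saw_other = False
    for ch in message:
        if not ch.isalpha() or ch.isascii():
            continue  # plain A-Z letters and non-letters decide nothing
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            saw_cyrillic = True
        elif name.startswith("LATIN"):
            saw_extended_latin = True
        else:
            saw_other = True
    # Assumed precedence for mixed-script messages: other scripts, then
    # Cyrillic, then extended Latin, then plain A-Z.
    if saw_other:
        return "JP"
    if saw_cyrillic:
        return "RU"
    if saw_extended_latin:
        return "ES"
    return "EN/ID"
```

With this sketch, `classify("qué bonito")` lands in the ES bucket and `classify("www")` is discarded, which matches the rules as written; how the real program handled mixed messages such as "草 lol" is not stated, so the JP result here is only one plausible choice.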

With all that out of the way, the last step was just deciding which individual streams to use. For this I pretty much just chose whatever the member’s most recent streams were so that I could get the most up-to-date data possible. In two specific instances, which I’ll note below, I did decide to forego a few more recent streams in favor of older streams in an attempt to get a more representative sample of that member’s average stream.

In terms of the volume of data, I used a minimum of 9 different streams per member (the exact amount varies by member based on a variety of other factors), a minimum of 15 hours of content per member, and a minimum of 50,000 chat messages for each Hololive member. Holostars had slightly laxer requirements, as they obviously get fewer chat messages, but I still used a minimum of 9 streams for each member.
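
One way to read those minimums is "keep adding the next most recent eligible stream until all three thresholds are met". The sketch below assumes exactly that; the `Stream` record and `select_streams` function are hypothetical names, and the real selection also involved manual judgment calls that this does not model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stream:
    title: str
    hours: float
    messages: int


def select_streams(
    recent_first: List[Stream],
    min_streams: int = 9,
    min_hours: float = 15.0,
    min_messages: int = 50_000,
) -> List[Stream]:
    """Take streams from newest to oldest until every minimum is satisfied."""
    chosen: List[Stream] = []
    hours = 0.0
    messages = 0
    for stream in recent_first:
        chosen.append(stream)
        hours += stream.hours
        messages += stream.messages
        if (
            len(chosen) >= min_streams
            and hours >= min_hours
            and messages >= min_messages
        ):
            break
    return chosen
```

If the list runs out before the thresholds are reached, this simply returns everything it was given; a stricter version could raise an error instead.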

Graphing

For the graphs, I rounded values to one decimal place. I also excluded any values below 1%, as they would be barely visible on most graphs and merely clutter up the graph. As a result, you will notice that many of the charts don’t add up to exactly 100%, due to both rounding errors and not including the small ES and RU percentages. In general, the further away from 100% the two shown numbers add to, the more ES and RU comments that member received.
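
A small worked example shows why the visible labels fall short of 100%. The counts below are made up for illustration and are not taken from the dataset; `chart_labels` is a hypothetical helper, and applying the 1% cutoff after rounding (rather than before) is an assumption.

```python
from typing import Dict


def chart_labels(counts: Dict[str, int], min_percent: float = 1.0) -> Dict[str, float]:
    """Convert bucket counts to percentages, round to 1 decimal, hide small slices."""
    total = sum(counts.values())
    labels: Dict[str, float] = {}
    for bucket, n in counts.items():
        percent = round(100 * n / total, 1)
        if percent >= min_percent:
            labels[bucket] = percent
    return labels


# Invented counts: ES and RU together make up 1% of messages but are hidden.
example = {"JP": 8000, "EN/ID": 1900, "ES": 70, "RU": 30}
shown = chart_labels(example)
shown_total = sum(shown.values())
```

Here only JP and EN/ID survive the cutoff, so the two printed numbers add up to 99% and the missing point is the hidden ES and RU share.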

Channel Specific Notes

Miko – Miko has been streaming a lot of Yakuza lately, which attracts a very Japanese-heavy chat compared to other stream content. I did include some Yakuza streams in her dataset, but I also passed over a bunch to include some earlier Minecraft streams instead in an effort to better represent the average content on her channel. It’s not as if Miko only plays Yakuza, and actually plays Minecraft quite regularly, so it didn’t really make sense to me to have, say, 8 Yakuza streams and 0 Minecraft streams in the dataset.

Haachama – I debated for a long time with myself if I should exempt Haachama from the language-specific content rule and make an active effort to include some of her English-focused streams in her dataset. She is in a very unique position among Hololive members and had previously really been making an effort to make a lot of content specifically for the English-speaking audience. However, I ultimately decided against it, as lately Haachama has not really been doing English-language content outside of collabs (her last solo English stream was on December 17), so I decided that her “average stream”, at least at the moment, is not really English-focused.

Pekora – See Miko. Same thing where I excluded some more recent Yakuza streams for earlier streams of different content. I still included some Yakuza streams, of course, and her Yakuza streams are also very long, so they’ll tend to contribute a large amount of messages. As a result, Pekora’s normal JP % (ie when she finishes streaming Yakuza so much) is probably a bit lower than the data here indicates.

Coco – This was another one that I debated a long time with myself about – should I include Coco’s meme reviews or not? On one hand, she very regularly does them on a schedule, so it can certainly be said that they’re part of her “average content.” But, on the other hand, you can argue that they are language-specific content and will create skew in the chat because of that. Ultimately, I decided to not include Coco’s meme reviews in her dataset. I can certainly see the other argument, too, though, and would not fault anyone who thinks they should be included. I had to make a decision, though, and that is what I chose.

Also on the topic of Coco, I will note that Coco’s language breakdown is very unique among Holo members. On most streams, she gets very, very few English comments – comparable to the lowest overall in Hololive. However, sometimes she randomly decides to speak mostly in English instead of Japanese on some streams, and on those streams she’ll instead get a ton of English comments, even outnumbering the Japanese ones, so this pushes her overall EN / ID percentage up to the 22.8% you see in the graph. Most members have a fairly consistent % across streams within a couple percentage points each way of their average, but Coco’s individual stream %’s instead have extremely high volatility.

Towa – There is one other bit of preprocessing to messages I did that pertains to Towa. Any message solely containing “TMT”, “TMD”, or “TCA” was excluded and not counted in any language’s bucket or in the total count of messages. This is because it’s a fairly cross-language thing to spam these letters at Towa, as you can’t type “TMT” in Japanese really (besides typing out the entire phrase which is a total hassle). Thus, counting them as EN / ID messages would be fairly misleading, as lots of the people typing them are probably actually Japanese.

If you’re curious, about 5.4% of the total messages in Towa’s chat (not counting emoji, numerical, etc. messages in the total) are some variation of “TMT.”
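
The Towa-specific exclusion could look something like the following. Only the three strings come from the post; ignoring case and surrounding whitespace is an assumption, and `is_towa_spam` and `spam_share` are hypothetical names.

```python
from typing import Iterable

# The three exact strings named in the post.
EXCLUDED_TOWA_SPAM = {"TMT", "TMD", "TCA"}


def is_towa_spam(message: str) -> bool:
    """True if the whole message is one of the excluded abbreviations."""
    # Assumption: case and surrounding whitespace are ignored.
    return message.strip().upper() in EXCLUDED_TOWA_SPAM


def spam_share(messages: Iterable[str]) -> float:
    """Percentage of messages that are pure TMT-style spam."""
    messages = list(messages)
    if not messages:
        return 0.0
    hits = sum(1 for m in messages if is_towa_spam(m))
    return 100 * hits / len(messages)
```

A message such as "TMT is true" is not excluded by this check, since the post only mentions messages solely containing the abbreviation.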

General Observations and Comments

Looking at the raw data, language shares tend to vary heavily with the type of content being streamed, as one might expect. Talking streams tend to get very low English shares, while gaming streams tend to get more, and singing streams the highest English share of all types of content. Among gaming streams, the exact game being streamed also seems to make a noticeable difference. Games like Fall Guys, Apex, and GTA attract many more EN / ID comments than games like Yakuza, ARK, or Mario Kart. Minecraft seems to be a fairly neutral game, with it not showing any consistent deviation one way or the other from each member’s average.

The total messages per hour chart might surprise some people, particularly the fact that Pekora is not even in the top 5 despite getting by far the most viewers out of all HololiveJP members. (Pekora is actually 8th, if you’re curious, behind the pictured five, Ayame (#6), and Rushia (#7).) If I may offer some explanations, there are a few possible ones that I can think of. It could simply be the case that Pekora attracts a lot more lurkers than other members. Perhaps her stream has more mainstream popularity, where many viewers enjoy watching it for entertainment, but aren’t invested enough in the Youtube ecosystem to actually participate in chat. Another explanation might be due to a significant portion of her dataset being made up of Yakuza streams, as I noted before. Perhaps Yakuza simply does not attract many chat messages compared to other types of content, so this drags her average down. A final explanation that comes to mind centers on the way that I processed the data. I immediately discarded and didn’t count any messages which consisted solely of an emoji, and, in my experience at least, Pekora’s viewers – for whatever reason – tend to spam emojis a lot more than other channels’ viewers do. It could be the case that Pekora’s chat actually does get the overall highest messages per hour, but my algorithm simply discarded most of those messages since it was primarily focused on language parsing and counting total chat messages was just a fun side statistic.

As a final comment, I would just like to remind everyone that this is, of course, not a definitive analysis. While I tried to be as rigorous as possible in my methods, ultimately this is only an analysis using around 10 streams of each member. If you extended the dataset to 30, 50, 100, etc. streams, you may find that you suddenly come up with different numbers. Collecting just this much data took me over a week, though, so you’ll forgive me if I wasn’t about to go catalogue the last 50 streams of each member. That said, other than any specific points of interest that I noted above, I do believe that the data presented here should be fairly accurate and that any additional data collection would lead to, at most, only a few percentage points swing in either direction.

What about HoloID and HoloEN?

I considered extending my analysis to EN and ID, but ultimately after doing a couple test experiments, their chats – even for members who can speak Japanese – are almost exclusively EN / ID messages. All other buckets would likely fall below the 1% threshold, or at best be barely above it, and looking at a bunch of 99% and 100% pie charts is not very informative or interesting, so I will not be doing the same analysis for HoloID or HoloEN.

Closing

If there’s anything that I didn’t mention here that you’re curious about, whether it be about the data itself, my methodology, or whatever, feel free to ask and I’ll do my best to answer. Oh and I apologize for the (lack of) graphic design in the images. I’m good at coding / statistics, not art.

Brunom1
Sep 5, 2011

Ask me about being the best dad ever.
https://www.youtube.com/watch?v=-zyKnRnzPao

Coco checks out the local cabaret and her manager lets out a sigh of relief upon hearing that she changed her mind on learning how to pole dance.

Mazerunner
Apr 22, 2010

Good Hunter, what... what is this post?
If you meet a child traveling upon the road...

lap him

https://www.youtube.com/watch?v=bVgAmfK2a9o

SpartanIvy
May 18, 2007
Hair Elf
Here's Kiara talking about how her old friends have been finding out she's a vtuber and it's an interesting insight into it


https://www.youtube.com/watch?v=V1Pm63WE04U

Taratang
Sep 4, 2002

Grand Master

astr0man posted:

Is this just based on what % of their chat messages are in EN vs JP, or was it actually looking at whatever info's available on people's yt/google account profiles?
It's just chat messages - actual viewer metrics are not public information and rarely discussed openly (especially after the whole Taiwan incident), but Pekora did mention a little while back that she was shocked her overseas viewership had climbed to over 50% of her total viewers.

Taratang fucked around with this message at 20:37 on Jan 22, 2021

trucutru
Jul 9, 2003

by Fluffdaddy

Taratang posted:

It's just chat messages - actual viewer metrics are not public information and rarely discussed openly (especially after the whole Taiwan incident), but Pekora did mention a little while back that she was shocked in a recent management meeting when they showed her her overseas viewership had climbed to almost 60% of her total viewers.

Yeah, Pekora mentioned some of her numbers in her end of year stream and there are a shitload of overseas people watching her.

Korone must have similar numbers, especially with how language agnostic lots of her streams are.

golden bubble
Jun 3, 2011

yospos

From the day 6 Apex Scrims

Potsticker
Jan 14, 2006


quote:

Also on the topic of Coco, I will note that Coco’s language breakdown is very unique among Holo members. On most streams, she gets very, very few English comments – comparable to the lowest overall in Hololive. However, sometimes she randomly decides to speak mostly in English instead of Japanese on some streams, and on those streams she’ll instead get a ton of English comments, even outnumbering the Japanese ones, so this pushes her overall EN / ID percentage up to the 22.8% you see in the graph. Most members have a fairly consistent % across streams within a couple percentage points each way of their average, but Coco’s individual stream %’s instead have extremely high volatility.

Wow, how weird is it that if I meticulously cut out language-specific content from all the other members, Coco's streams (even after cutting most of her content, that is) still manage to get different comments based on which language she's primarily speaking in her stream!

This person's methodology is trash, and they're trying to force a specific result by throwing out data that conflicts with it. If a streamer is speaking in one language or another, it's only natural for the comments to match the language they are speaking in.

Misanthropic Void
Oct 9, 2012

I seem to remember Subaru talking about her overseas viewership being around 30%, so those statistics are a bit off, as I can bet you most people don't type anything. Me, for example, I just full screen the stream and talk in the respective streamer's fan Discord.

Lyer
Feb 4, 2008

I just talk out loud in hopes they'll hear me

In all seriousness, the time of day and the average number of streams during that time period greatly influence the outcome as well.

Sundae
Dec 1, 2005
Mod and OP OK'd this via PM already. I promise I won't spam you guys with more stuff if you're not interested. :smith:

A good friend of mine fell down the VTuber rabbit hole last year and decided to launch her own character/channel this year. I originally groaned at the idea because holy gently caress are there a lot of bad indie VTubers, but I have to admit that she's put together an actually good character and setup. So, I ate my words and said I'd help her get word out that she's starting. I'd like to share her with you folks to see if any of you would be interested in checking out her debut next week at 20:00 Pacific on 29JAN.. :) I actually think she's got something pretty awesome going on with this one. :)

Her ship crashed on earth while on auto-pilot (it was sandwich time) and now she's pretending to be a summer-camp counselor while she repairs it in secret. The kids have all gone home for the year, so she's streaming while she has the whole place to herself. Her content will feature video games, raiding the counselors' secret booze stash for drunken chat, gesture/figure drawing sessions as well as longer art projects (think Ina's art streams), and the occasional karaoke stream. Depending on viewership, she may also do Hindi-language streams on occasion since she is bilingual.

https://www.youtube.com/watch?v=e0Dt5SRzLF8

Her Channel



Debut week schedule:


Luna promises she won't kill you. Probably.



I'm trying to think of someone to compare her to (you know, "if you like X, you might also like Luna"), but I'm honestly having trouble with that. She's sarcastic, witty, a good artist (not manga, FYI), but she sort of fits into her own little category personality-wise, at least among the EN-language tubers I've seen. Maybe there's a JP or ID who is comparable, but I don't speak the languages. As best I can tell, she's her own person and I'm honestly pretty hyped for the launch. :)


TL;DR?
1) Female EN streamer who streams in NA times.
2) Live2D, 1080p 60fps stream.
3) Remarkably high-quality rig/model for an indie, and I'm seriously jealous of her PC.
4) Games, art, singing, drinking, and collabs TBA.
5) Jan 29, 8:00 PM Pacific Time.

She says, "Be there or you'll miss out on the cookies." If you guys have any questions, I can relay them to her, or I can STFU and we can all go watch idiots in chat talk about Kiara and Calli's wedding for the hundredth straight stream.

clockwork chaos
Sep 15, 2009




wow :3 thats a neat character design
i like it

Julias
Jun 24, 2012

Strum in a harmonizing quartet
I want to cause a revolution

What can I do? My savage
nature is beyond wild

That's a pretty good name/design. Kudos to your friend, and good luck to her and her foray into vtubing!

Jerkface
May 21, 2001

HOW DOES IT FEEL TO BE DEAD, MOTHERFUCKER?

Fallen Rib
Yea love the LIVE2D there, yet another Indie with an amazing rig.

Pollyanna
Mar 5, 2005

Milk's on them.


Oh man that is a good design.

Brunom1
Sep 5, 2011

Ask me about being the best dad ever.
https://www.youtube.com/watch?v=nz-nmF4ByNI

Watson's starting Mario Galaxy in twenty minutes.

https://www.youtube.com/watch?v=qTwHCC4stu8

2 hours later, Gura has her celebration karaoke stream.

Kwyndig
Sep 23, 2006

Heeeeeey


I love that thumbnail, that's a really cute Gura.

Lyer
Feb 4, 2008

PPT still going at it with the voice over, crazy.

Onean
Feb 11, 2010

Maiden in white...
You are not one of us.
"YEAH!"
...
"No."
https://twitter.com/EngVTubersOOC/status/1352366895144513539

sb hermit
Dec 13, 2016





Julias posted:

That's a pretty good name/design. Kudos to your friend, and good luck to her and her foray into vtubing!

:same:

Lt. Lizard
Apr 28, 2013
Gura's Visage streams are extremely powerful. (can't believe this part wasn't clipped yet)

https://www.youtube.com/watch?v=wuUpQStQ6Ss&t=4080s

Talorat
Sep 18, 2007

Hahaha! Aw come on, I can't tell you everything right away! That would make for a boring story, don't you think?

Wait. Holy poo poo. Has the same emote fish been a shark because same this whole time.

Revolver Bunker
May 12, 2004

「この一撃にかけるっ!」
I'm watching Haachama's stream from 8 hours ago and it's certainly something. https://www.youtube.com/watch?v=zDYyOKP2nMs Sock gloves, gummy bear tarot cards, low lighting on a dark table, and her going through the list of fellow members. At some point I could almost hear Mio rolling around in her sleep. Now that she's buried Akai Haato in a deep pit she's slowly making her way to the others.

Ririsya came out with a new original song and MV to celebrate her birthday.
https://www.youtube.com/watch?v=jBgCkYECapk

Rubellavator
Aug 16, 2007

Talorat posted:

Wait. Holy poo poo. Has the same emote fish been a shark because same this whole time.

Yes. It's a poster from nichijou

Misanthropic Void
Oct 9, 2012

Revolver Bunker posted:

... Sock gloves, gummy bear tarot cards, low lighting on a dark table, and her going through the list of fellow members...

If only her crimes against tarot (or cards in general) stopped there. Her shuffling and drawing methods were... original, to say the least.

Revolver Bunker
May 12, 2004

「この一撃にかけるっ!」

Misanthropic Void posted:

If only her crimes against tarot (or cards in general) stopped there. Her shuffling and drawing methods were... original, to say the least.

It reminded me of shaman/voodoo magic where they'd cast bones and then read the positioning. Only Haachama is using gummy bear cards that somehow come off as more cursed than they really should be. Something about how those bears are drawn onto the cards.

Kata-Haro
Oct 21, 2010

Nene and Polka really have some strong sibling energy going

https://www.youtube.com/watch?v=ASSmcRounNU

kirbysuperstar
Nov 11, 2012

Let the fools who stand before us be destroyed by the power you and I possess.
I also have a friend who is a budding vtuber (she got Twitch affiliate back in November, though!) that I'll quickly plug if that's okay.

https://twitter.com/ZafraZaleska
https://www.twitch.tv/transienttiefling

Games, some drawing stuff, usually has a friend or two with her.

Ok thanks, I'll now go back to replying to anyone I disagree with using the "Shut up beeech" Pekora video.

Jerkface
May 21, 2001

HOW DOES IT FEEL TO BE DEAD, MOTHERFUCKER?

Fallen Rib

Kata-Haro posted:

Nene and Polka really have some strong sibling energy going

https://www.youtube.com/watch?v=ASSmcRounNU

I love them both, can't wait for Nene's new outfit. It should be next weekend??? Soon?

Pyronic
Oct 1, 2008

ROYAL RAINWHARRGARBL

Jerkface posted:

I love them both, can't wait for Nene's new outfit. It should be next weekend??? Soon?

new outfit reveal's the 31st i believe.

Falls Down Stairs
Nov 2, 2008

IT KEEPS HAPPENING
https://www.youtube.com/watch?v=qTwHCC4stu8 We've got a shark karaoke starting now

Benne
Sep 2, 2011

STOP DOING HEROIN
Gura's 2M karaoke is live https://www.youtube.com/watch?v=qTwHCC4stu8

Takoluka
Jun 26, 2009

Don't look at me!




kirbysuperstar posted:

I also have a friend who is a budding vtuber (she got Twitch affiliate back in November, though!) that I'll quickly plug if that's okay.

https://twitter.com/ZafraZaleska
https://www.twitch.tv/transienttiefling

Games, some drawing stuff, usually has a friend or two with her.

Both of these are great! We should post smaller Vtubers more often.

Amp
Sep 10, 2010

:11tea::bubblewoop::agesilaus::megaman::yoshi::squawk::supaburn::iit::spooky::axe::honked::shroom::smugdog::sg::pkmnwhy::parrot::screamy::tubular::corsair::sanix::yeeclaw::hayter::flip::redflag:
I hope Gura does a Calli rap.

Kyte
Nov 19, 2013

Never quacked for this

That's a really good design! Only thing I'd point out is that the boing is a bit much. She might want to turn it down at some point.

Falls Down Stairs
Nov 2, 2008

IT KEEPS HAPPENING

ShallNoiseUpon posted:

I hope Gura does a Calli rap.

She's doing Hinotori right this second, which officially put the possibility on the table for me.
