scikit-learn
from IPython.display import YouTubeVideo
YouTubeVideo('2lpS6gUwiJQ')
scikit-learn and friends
scikit-learn is becoming the de facto machine learning library for Python. Its friends in this notebook:
scipy and numpy
pandas
bokeh
The data file comes from figshare: http://figshare.com/articles/reddit_user_posting_behavior/874101
%%bash
head reddit_user_posting_behavior.csv
603,politics,trees,pics
604,Metal,AskReddit,tattoos,redditguild,WTF,cocktails,pics,funny,gaming,Fitness,mcservers,TeraOnline,GetMotivated,itookapicture,Paleo,trackers,Minecraft,gainit
605,politics,IAmA,AdviceAnimals,movies,smallbusiness,Republican,todayilearned,AskReddit,WTF,IWantOut,pics,funny,DIY,Frugal,relationships,atheism,Jeep,Music,grandrapids,reddit.com,videos,yoga,GetMotivated,bestof,ShitRedditSays,gifs,technology,aww
606,CrohnsDisease,birthcontrol,IAmA,AdviceAnimals,AskReddit,Endo,WTF,TwoXChromosomes,pics,funny,Jeep,Mustang,4x4,CCW,dogpictures,Cartalk,aww
607,space,Fitment,cars,Economics,Libertarian,240sx,UserCars,AskReddit,WTF,Autos,formula1,pics,funny,bodybuilding,gaming,Drifting,Justrolledintotheshop,atheism,gadgets,videos,business,gamernews,Cartalk,worldnews,carporn,technology,motorsports,Nissan,startrek
608,politics,Flagstaff,Rainmeter,fffffffuuuuuuuuuuuu,pcgaming,screenshots,truegaming,AdviceAnimals,Guildwars2,gonewild,gamingsuggestions,Games,AskReddit,dubstep,skyrim,SuggestALaptop,battlefield3,WTF,starcraft,creepy,pics,funny,darksouls,books,gaming,mw3,hentai,halo,atheism,magicTCG,swtor,SOPA,anime,IndieGaming,Jokes,wow,gifs,Design,NAU,Android,technology,Minecraft,aww,GameDeals,playitforward,pokemon
609,Clarinet,AdviceAnimals,festivals,SubredditDrama,InternetAMA,AskReddit,aves,cringe,MemesIRL,Music,AmISexy,electricdaisycarnival,ForeverAlone
610,RedHotChiliPeppers,fffffffuuuuuuuuuuuu,tifu,civ,gameofthrones,IAmA,AdviceAnimals,movies,explainlikeimfive,SubredditDrama,gonewild,todayilearned,trees,AskReddit,soccer,skyrim,WTF,germany,pics,funny,seduction,circlebroke,sto,gaming,4chan,atheism,circlejerk,Music,apple,cats,videos,John_Frusciante,minimalism,trackers,worldnews,gifs,beermoney,Android,technology,startrek,Frisson
611,beertrade,AskReddit,WTF,beer,batman,BBQ,beerporn,Homebrewing
612,politics,2012Elections,Parenting,IAmA,fresno,picrequests,AskReddit,loseit,WTF,Marriage,Mommit,pics,funny,VirginiaTech,loseit_classic,RedditLaqueristas,atheism,LadyBoners,GradSchool
import pandas as pd
# list(range(25)) keeps the column names a plain list on both Python 2 and Python 3
pd.read_csv("reddit_user_posting_behavior.csv", nrows=10, names=["user"] + list(range(25))).fillna("")
 | user | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ... | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 603 | politics | trees | pics | ... | ||||||||||||||||
1 | 604 | Metal | AskReddit | tattoos | redditguild | WTF | cocktails | pics | funny | gaming | ... | trackers | Minecraft | gainit | |||||||
2 | 605 | politics | IAmA | AdviceAnimals | movies | smallbusiness | Republican | todayilearned | AskReddit | WTF | ... | atheism | Jeep | Music | grandrapids | reddit.com | videos | yoga | GetMotivated | bestof | ShitRedditSays |
3 | 606 | CrohnsDisease | birthcontrol | IAmA | AdviceAnimals | AskReddit | Endo | WTF | TwoXChromosomes | pics | ... | Cartalk | aww | ||||||||
4 | 607 | space | Fitment | cars | Economics | Libertarian | 240sx | UserCars | AskReddit | WTF | ... | Drifting | Justrolledintotheshop | atheism | gadgets | videos | business | gamernews | Cartalk | worldnews | carporn |
5 | 608 | politics | Flagstaff | Rainmeter | fffffffuuuuuuuuuuuu | pcgaming | screenshots | truegaming | AdviceAnimals | Guildwars2 | ... | SuggestALaptop | battlefield3 | WTF | starcraft | creepy | pics | funny | darksouls | books | gaming |
6 | 609 | Clarinet | AdviceAnimals | festivals | SubredditDrama | InternetAMA | AskReddit | aves | cringe | MemesIRL | ... | ||||||||||
7 | 610 | RedHotChiliPeppers | fffffffuuuuuuuuuuuu | tifu | civ | gameofthrones | IAmA | AdviceAnimals | movies | explainlikeimfive | ... | skyrim | WTF | germany | pics | funny | seduction | circlebroke | sto | gaming | 4chan |
8 | 611 | beertrade | AskReddit | WTF | beer | batman | BBQ | beerporn | Homebrewing | ... | |||||||||||
9 | 612 | politics | 2012Elections | Parenting | IAmA | fresno | picrequests | AskReddit | loseit | WTF | ... | RedditLaqueristas | atheism | LadyBoners | GradSchool |
10 rows × 26 columns
%%time
user_ids = []
subreddit_ids = []
subreddit_to_id = {}
i = 0
with open("reddit_user_posting_behavior.csv", 'r') as f:
    for line in f:
        # the first field is the user; the remaining fields are subreddits
        for sr in line.rstrip().split(",")[1:]:
            if sr not in subreddit_to_id:
                subreddit_to_id[sr] = len(subreddit_to_id)
            user_ids.append(i)
            subreddit_ids.append(subreddit_to_id[sr])
        i += 1

import numpy as np
from scipy.sparse import csr_matrix

rows = np.array(subreddit_ids)
cols = np.array(user_ids)
data = np.ones((len(user_ids),))
num_rows = len(subreddit_to_id)
num_cols = i

# the code above exists to feed this call
adj = csr_matrix((data, (rows, cols)), shape=(num_rows, num_cols))
print(adj.shape)
print("")

# now we have our matrix, so let's gather up a bit of info about it
users_per_subreddit = adj.sum(axis=1).A1
# list() so the id -> name lookup can be filled in by index
subreddits = list(range(len(subreddit_to_id)))
for sr in subreddit_to_id:
    subreddits[subreddit_to_id[sr]] = sr
subreddits = np.array(subreddits)
(15122, 876961)

CPU times: user 9.77 s, sys: 288 ms, total: 10.1 s
Wall time: 10.1 s
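To see what the cell above builds without loading the whole file, here is a minimal sketch of the same idea on three made-up lines. The names `toy_lines` and `build_adjacency` are hypothetical and exist only for this illustration; the layout (one row per subreddit, one column per user) follows the cell above.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for three lines of the CSV: user id, then subreddits.
toy_lines = [
    "1,pics,funny",
    "2,funny,soccer",
    "3,pics",
]

def build_adjacency(lines):
    """Return (matrix, names): one row per subreddit, one column per user."""
    subreddit_to_id = {}
    rows, cols = [], []
    for user_index, line in enumerate(lines):
        for sr in line.rstrip().split(",")[1:]:
            # setdefault hands out the next free id the first time a name is seen
            rows.append(subreddit_to_id.setdefault(sr, len(subreddit_to_id)))
            cols.append(user_index)
    data = np.ones(len(rows))
    shape = (len(subreddit_to_id), len(lines))
    # subreddit names ordered by their row id
    names = sorted(subreddit_to_id, key=subreddit_to_id.get)
    return csr_matrix((data, (rows, cols)), shape=shape), names

toy_adj, toy_names = build_adjacency(toy_lines)
print(toy_names)
print(toy_adj.toarray())
```

Only the (row, column) pairs that actually occur are stored, which is why the real matrix fits in memory even though its dense shape is enormous.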
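Our adjacency matrix is a bit problematic to deal with as-is: each of the 15,122 subreddits is described by 876,961 user columns, almost all of them zero.
scikit-learn has a decomposition package, and its TruncatedSVD can reduce a scipy sparse matrix like this one to a small number of dense dimensions.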
%%time
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import normalize
svd = TruncatedSVD(n_components=100)
embedded_coords = normalize(svd.fit_transform(adj), norm='l1')
print(embedded_coords.shape)
(15122, 100)

CPU times: user 1min 8s, sys: 4.94 s, total: 1min 13s
Wall time: 1min 14s
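The same two calls can be tried on a matrix small enough to read. This is only a sketch: the 4 × 6 `toy` matrix is invented for illustration, and `random_state=0` is added here just to make the run repeatable.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import normalize

# Hypothetical toy data: 4 items described by 6 sparse binary features.
toy = csr_matrix(np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1],
], dtype=float))

toy_svd = TruncatedSVD(n_components=2, random_state=0)
toy_coords = toy_svd.fit_transform(toy)       # dense array, one row per item
toy_unit = normalize(toy_coords, norm='l1')   # scale each row so its absolute values sum to 1

print(toy_coords.shape)
print(toy_unit)
```

The L1 normalization keeps only the direction of each row, so an item with many nonzero entries does not dominate every axis simply because of its size.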
The output is kind of neat. First, the cumulative share of variance explained by the 100 components:
%matplotlib inline
pd.DataFrame(np.cumsum(svd.explained_variance_ratio_)).plot(figsize=(13, 8))
<matplotlib.axes._subplots.AxesSubplot at 0x7f4fdd633a50>
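A cumulative curve like this is often read as "how many components do I need to reach a given share of the variance". Here is a small sketch of that lookup; the `ratios` values are made up for illustration and `components_needed` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np

# Hypothetical explained-variance ratios, standing in for svd.explained_variance_ratio_
ratios = np.array([0.40, 0.25, 0.15, 0.10, 0.05])
cumulative = np.cumsum(ratios)

def components_needed(cumulative, target):
    """Smallest number of leading components whose cumulative ratio reaches target,
    or None if even all of them together fall short."""
    idx = int(np.searchsorted(cumulative, target))
    return idx + 1 if idx < len(cumulative) else None

print(components_needed(cumulative, 0.5))
print(components_needed(cumulative, 0.7))
print(components_needed(cumulative, 0.99))
```

`np.searchsorted` relies on the input being sorted, which holds here because a cumulative sum of non-negative ratios never decreases.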
# this function will show you the axes on which a particular subreddit scores the highest/lowest
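def pickOutSubreddit(sr):
    # axes ordered from this subreddit's highest coordinate to its lowest
    sorted_axes = embedded_coords[list(subreddits).index(sr)].argsort()[::-1]
    # for each of those axes, list every subreddit from highest to lowest score
    return pd.DataFrame(subreddits[np.argsort(embedded_coords[:,sorted_axes], axis=0)[::-1]], columns=sorted_axes)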
pickOutSubreddit("soccer")
 | 44 | 46 | 45 | 40 | 0 | 42 | 26 | 27 | 47 | 21 | ... | 62 | 52 | 43 | 54 | 69 | 61 | 50 | 49 | 22 | 25 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | bevandele | PrettyOlderWomen | ParisSG | malefashionadvice | AskReddit | DoesAnybodyElse | CraftMasters | blackmailporn | ShitLiamDoes | ManchesterVegan | ... | knockoutgifs | JapaneseFiction | gameofthrones | applehelp | PicsOfJithSleeping | outercourse | osx | ShitLiamDoes | gonewildstories | CraftMasters |
1 | ParisSG | soccerbot | bevandele | malefashion | funny | UrethraPorn | Nbz | underboob | silverandguns | asutosh | ... | outercourse | davidfosterwallace | shewantstofuck | macsetups | EdmontonOilers | mindashq | simpleios | techsupport | BDSMpersonals | Nbz |
2 | soccerbot | bevandele | FootballMedia | goodyearwelt | pics | HotDogPorn | pokemon | Sexfight | techsupport | goatedition | ... | mindashq | whatsthatbook | asoiaf | apple | hockeylayouttest | NBA2k | ios | bubbleswithfaces | BDSMcommunity | pokemon |
3 | FootballMedia | FTLStrikers | soccerbot | rawdenim | WTF | SpidersGoneWild | pokemonteams | gangbang | uscgames | GOAT_TRUTH | ... | Articles | memphismayfire | aSongOfMemesAndRage | ios | hockey | StateFarm | retina | talesofmybowels | bdsm | vanguardtips |
4 | FTLStrikers | ParisSG | FTLStrikers | frugalmalefashion | gaming | SomeRandomReddit | vanguardtips | The_Porn_Family | Goblinism | worldnews | ... | postcolonialism | Earlyjazz | asoiafcirclejerk | retina | hockeyplayers | nba | iOSProgramming | 24hoursupport | polyamory | AsianFeet |
5 | PrettyOlderWomen | FootballMedia | PrettyOlderWomen | yusufcirclejerk | AdviceAnimals | FuckingFish | AsianFeet | ChangingRooms | Transmogrification | sticknpokes | ... | PeterL | alt_lit | AGOTBoardGame | jasmineapp | leafs | Basketball | appletv | ouch | BDSMGW | thongsandals |
6 | soccer | reddevils | coys | AustralianMFA | IAmA | SnooPorn | thongsandals | CumFacials | wow | CatsInBusinessAttire | ... | Basketball | walterjohnson | tesothemereddit | ipad | Habs | BasketballTips | mac | Buttcoin | JL2579 | ebonyfeet |
7 | reddevils | LiverpoolFC | footballtactics | mfacirclejerk | videos | FlyingFuck | ebonyfeet | wetandwild | wowscrolls | douchebagfoundation | ... | AskMen | booklists | TGOD | AlienBlue | rangers | heat | apple | IWantToBeAMod | chickflixxx | TruePokemon |
8 | Gunners | soccer | soccer | shittymspaints | todayilearned | KittyPorn | EvolutionofKits | TVnewsbabes | WowUI | nnDMT | ... | BasketballTips | artshub | kdubz1298 | mac | canucks | lakers | macapps | nexusq | gonewildaudio | pokemonteams |
9 | FantasyPL | MCFC | Gunners | europeanmalefashion | atheism | gnomewild | ShinyPokemon | LolitaCheng | WoWGoldMaking | AustralianFilm | ... | StateFarm | Adamphotoshopped | asoiafreread | iphone | DetroitRedWings | benchgifs | applehelp | prestashop | pregnant | pokemontrades |
10 | coys | Gunners | soccercirclejerk | malehairadvice | Omaha | HobosGoneWild | nuzlocke | booty_gifs | wowstrat | AscensionOmaha | ... | Mavericks | metacsec | Dreadfort | commonsense | AnaheimDucks | mildlyimpressive | macsetups | sergaljerk | sexpertslounge | pokemonrng |
11 | Barca | chelseafc | reddevils | ldshistory | aww | PicsOfHorseVaginas | Pokemongiveaway | SpyShots | woweconomy | JKP | ... | benchgifs | breakingbad | CK2GameOfthrones | appletv | losangeleskings | LAClippers | iPhoneDev | purple | DeadBedrooms | EvolutionofKits |
12 | chelseafc | Fifa13 | Barca | malefoodadvice | UIUC | HalloweenPorn | pokemonrng | plugged | wowguilds | indiansports | ... | NBA2k | mildlyinteresting | CTI | GeekTool | OttawaSenators | NBASpurs | ipad | labradoodles | sex | pokemonarts |
13 | realmadrid | realmadrid | football | itafterdark | houston | WeirdSubreddits | pokemontrades | exgirlfriendpictures | WoWStreams | iknewgoatsweretrouble | ... | lakers | laundryview | WesterosiProblems | JamieChung | penguins | DavidsQuotes | WebApps | TrackThrows | TwoXSex | Pokemoncollege |
14 | LiverpoolFC | FantasyPL | NUFC | TeenMFA | Columbus | WhyWouldYouFuckThat | stunfisk | RedHeelsGW | wowpodcasts | traphentai | ... | nba | vinyldjs | Asoiafspoilersall | jailbreak | BlueJackets | torontoraptors | iphone | bttf | CowLand | nuzlocke |
15 | MCFC | coys | LiverpoolFC | preppy | Dallas | SantaPorn | Pokemoncollege | CumSwallowing | wowraf | Turkey | ... | heat | SexyMusicVideos | livechat | osx | Flyers | SchooledUp | iphonehelp | RegretfulSexStories | diamondmine | ShinyPokemon |
16 | footballmanagergames | FIFA12 | footballmanagergames | adventureporn | politics | burningporn | normalboots | souse | 24hoursupport | KateeOwen | ... | torontoraptors | bookhaul | GavinQuotes | booklists | devils | memphisgrizzlies | jailbreak | drawings | FanPuns | Pokemongiveaway |
17 | bootroom | EA_FIFA | Bundesliga | breitling | Austin | CucumberPorn | PokemonROMhacks | xxxstash | bubbleswithfaces | shewantstofuck | ... | LAClippers | murakami | Tumba | AppHookup | EA_NHL | warriors | AlienBlue | computertechs | MinecraftChampions | stunfisk |
18 | Fifa13 | FIFA | chelseafc | uniqlo | kansascity | PigsGoneWild | PokemonLeagueDS | drawngonewild | worldofwarcraft | Israel | ... | AskWomen | circlejerkbreakingbad | HypnoHookup | iphonehelp | Coyotes | kanyewest | AppHookup | Sandwiches | WolfPAChq | GenerationOne |
19 | fcbayern | Barca | MilitaryGear | gayforsaulyd | Tucson | Clotheshangers | Pokemonexchange | Cumonboobs | WoWNostalgia | haligonients | ... | SneakerDeals | skinnytail | OliviaMunn | simpleios | BostonBruins | Mavericks | obits | Kirbs2002 | Swingers | pokemonconspiracies |
20 | football | bootroom | MCFC | malelivingspace | Purdue | WTF_Wallpapers | denpamen | manbetterporn | FTH | Filipinology | ... | dating_advice | booksuggestions | maturewoman | iOSthemes | SanJoseSharks | bostonceltics | commonsense | tesothemereddit | DissidiaCraft | pokemonchallenges |
21 | FIFA | footballmanagergames | realmadrid | Watches | orlando | thismemeshoulddie | pkmntcgtrades | DragonageNSFW | redditguild | typescript | ... | DavidsQuotes | bookclub | Spartacus_TV | SEGA32X | stlouisblues | nbacirclejerk | SEGA32X | ABotOfIceAndFire | boule | Pokemonexchange |
22 | FIFA12 | fcbayern | KitSwap | swagteamsix | Boise | windowshots | GenerationOne | ToeSucking | Rift | HealthyWeightLoss | ... | Oaxaca | books | freakyfetishstories | apps | winnipegjets | kings | macgaming | asoiaf | minecraftium | pokemonrp |
23 | EA_FIFA | Aleague | LigaMX | paulrudd | Atlanta | traversecity | TruePokemon | assgifs | ALS | arabic | ... | mildlyimpressive | BooksAMA | Pokenawa | macgaming | hawks | chicagobulls | jasmineapp | gameofthrones | TriPixel | PokemonROMhacks |
24 | borussiadortmund | ACMilan | donenad | mensfashionadvice | Charlotte | ThisDayInHistory | pokemonconspiracies | joi | MMORPG | Syria | ... | NBASpurs | publishing | Banshee | iosgaming | NewYorkIslanders | OrlandoMagic | ScreenplayCoverage | SurplusEngineering | mathias | fireemblem |
25 | SoccerBetting | TrueGunners | fcbayern | soccergaming | fsu | failedpilots | pkmntcgcollections | cumov | MovieWallpapers | mexico | ... | AlienExchange | DogsWithCatHeads | CelebsInTights | Panera | ColoradoAvalanche | suns | iOSthemes | Dreadfort | GWAsians | pkmntcg |
26 | ACMilan | footballtactics | seriea | FantasyPL | Louisville | wikipedia | PokePlayThru2013 | asstastic | wowtcg | pernicus | ... | kanyewest | bookshelf | Conservativebooks | classics | hockeygoalies | GoNets | flextweak | kuro5hit | lmm | pkmntcgtrades |
27 | Aleague | borussiadortmund | LeedsUnited | supremeclothing | SaltLakeCity | bronycringe | pkmntcg | SymphonieVonBondage | punkshots | Palestine | ... | OrlandoMagic | illusionporn | justified | askjailbreak | sabres | NYKnicks | zsh | PlayPassOrPause | mccountercraft | PokemonLeagueDS |
28 | NUFC | soccergaming | ACMilan | PrettyOlderWomen | VirginiaTech | carlhprogramming | pokemonbattles | MNGoneWild | turnoverpie | 1428 | ... | SchooledUp | antiatheism | lickingdick | iWallpaper | caps | RNBA2KFantasy | Panera | kdubz1298 | Hypermine | SoloPokes |
29 | footballtactics | NUFC | bootroom | malegrooming | Hawaii | pic | pokemonrp | gonewild | diablo3 | BDS | ... | bostonceltics | infographic | illustratingreddit | CSUDH | fantasyhockey | rockets | CSUDH | Kazakhstan | massage | hotmidgets |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
15092 | shittygunpictures | TreesSuckingOnThings | bathroomgraffitiporn | Iron | Bad_ass_girlfriends | LadiesofScience | CuriosityCube | craftit | xenominer360 | fragworks | ... | asianproblems | browncoats | blackgirlgamers | Adamphotoshopped | dpmansen | baby | Holmes | DoctorWhoFreedom | StLouisRams | nfl |
15093 | Shotguns | lisp_ja | DAE | ReVenture | services | trashynovels | computer | wtfgames | Trathira | ... | BanjoMonkey | euphonium | LadiesofScience | CAmmunity | Jaguars | steak | ournameisfun | paganmusic | 49ers | mccountercraft | |
15094 | scguns | comedywriting | polycomics | Neopsychedelia | Pornoeverywherexxx | NaturalBeauty | ykwih | CherylCole | starcraft2_class | Chydrego | ... | WeirdFiction | classicwho | ForeverAloneWomen | cachatfanfic | miamidolphins | FoodVideos | horrorlit | ios | CatTeamBrotherhood | Browns |
15095 | progun | freesoftware | PlayPassOrPause | Fitness | Hardcore_PornGifs | IdeaExchange | pcgamingtechsupport | electricents | DNE | DLGOKN | ... | Scoobydoo | thelook | fashion | HouseMD | pickupfootball | NFLhuddle | gallifrey | apple | nfl | lmm |
15096 | 300BLK | tinycode | DYR | homegym | FitnessGirls | duckface | adelaidenews | TwoXSex | quicklooks | Shaskel | ... | RadicalFeminism | maisiewilliams | adventureporn | GunslingerMusic | Hawkeye_Football | PacificNorthwest | AmyAdams | macsetups | BIRDTEAMS | miamidolphins |
15097 | opencarry | steamforlinux | wtfgames | AlternativeHealth | ShemaleSwag | FashionPlus | XXX_Animated_Gifs | TheGirlSurvivalGuide | Hobbies | androidtablets | ... | sidewayspony | gallifrey | vegproblems | cell | Lincolnshire | caps | DoctorWhumour | macapps | fcs | minnesotavikings |
15098 | gunpolitics | systems | DNE | fargo | Entrenched | mydrunkkitchen | infinityblade | ABraThatFits | starcraft_strategy | PleX | ... | malepolish | hilaribad | 2XLite | irvine | eroticcomics | AskMen | lexlinguae | crazysteve | NYGiants | TriPixel |
15099 | Glocks | allthingsterranbot | The_Cheated | ecig_vendors | spacedikdiks | NonverbalComm | CableManagement | ProjectRunway | Destiny | Philanthropy | ... | FemmeThoughtsFeminism | teefury | shittymspaints | avatarvideos | Texans | dating_advice | write | gamegenome | Chargers | mathias |
15100 | CZFirearms | coding | WriteWithMe | Hutchinson | frontiama | circlejerk | HarvardTonight | StreetArtPorn | skypepartycirclejerk | numerical | ... | hugenaturals | ASMRmusic | EdmontonGoneWild | nwo | syrianrefugees | ravens | logophilia | drwho | Mariners | boule |
15101 | tc_archery | cloudcomputing | PicsOfHorseVaginas | askedreddit | front | InfertilityBabies | worms | GetEmployed | asd | yoyohoneysingh | ... | olderlesbians | DoesAnybodyElse | Cheetahs | juanbuiza | UniversityOfHouston | nflblogs | litvideos | Sherlock | EvilLeagueOfEvil | minecraftium |
15102 | knives | puremathematics | FlyingFuck | kettlebell | frontscience | closetswap | eddfaction | entwives | castit | Valesrubberduckies | ... | SeattleHistory | MLPArt | femmit | FlashForward | Madden_NFL | bengals | doctorwho | toolSquad | gifrequests | sportsbook |
15103 | gundeals | PostgreSQL | HotDogPorn | StudentNurse | 2012askanything | technews | buildapcsales | 2XLookbook | StarcraftCirclejerk | Surface | ... | Cheetahs | MovieWallpapers | veggieteens | louie | ConnectedCareers | postcolonialism | neilgaiman | Johnlock | sexmix247 | JordanCox2 |
15104 | Mini14 | mathisbeautiful | FuckingFish | kingcounty | bipoliticalandcurious | GillianJacobs | ClearBackblast | sheltie | HotSBetaKeys | bttf | ... | OperationSayFuckALot | Digihentai | femalefashionadvice | Jericho | ravens | freddiemercury | 80sElectro | classicwho | cowboys | NFL_Draft |
15105 | guns | dalcs4168 | SomeRandomReddit | volunteeringsolutions | johndollarfullofcrap | AlienExchange | Zendaya | Mommit | itmejp | AbletonProductions | ... | TwoXChromosomes | bedhg | frugalbeauty | iamgoingtohellforthis | boats | JordanCox2 | booksuggestions | doctorwhocirclejerk | Tennesseetitans | MinecraftChampions |
15106 | CompetitionShooting | ocaml | SnooPorn | todayiwatched | alfrankenstein | TPOP | Jab | LumiaLovers | slothswitharms | PiCases | ... | BreastPumps | DoctorWhumour | Endo | huntedseries | Seahawks | effzeh | ROIO | whovianents | CollegeBasketball | DissidiaCraft |
15107 | CCW | vim | Clotheshangers | RAMD | GreenCarLovers | preppy | HardcoreSex | USMilitarySO | StarcraftDeutschland | Ermahgerd | ... | Rights4Men | Humiliation | BodyAcceptance | RandomActsOfBidets | podcast | LinusDiaries | orderofthephoenix | karengillan | SaintLouisRams | WolfPAChq |
15108 | ak47 | scribblenauts | PigsGoneWild | fitmeals | AutomobileTechnology | TheSexCave | AvaSambora | VintageLadyBoners | starcraft | AvaDevine | ... | grooveshark | GeeKnitting | malefashionadvice | maggielawson | ConnectedCareers2 | sabres | hyperfurs | dwlounge | Patriots | CowLand |
15109 | saiga | pervasivecomputing | CucumberPorn | xxfitness | albanianarchandpics | StrandedWhale | WhatsInThisPool | rant | allthingsprotoss | GNURadio | ... | GirlsinPinkUndies | t:latestoneage | 2XLookbook | HHN | circuloidiota | Redskins | Random_Acts_of_Books | davidtennant | internetdefense | 49ers |
15110 | Gunsforsale | emacs | UrethraPorn | shittyshitredditsays | engineersissues | Admin | MoundofVenus | TomeNET | wowbro | Information_Security | ... | twilightzonedates | IsabelleFuhrman | feetish | circlejerkbreakingbad | sext | Flyers | artshub | IsabelleFuhrman | Texans | NYGiants |
15111 | gats | scheme | burningporn | ShittyTheoryOfReddit | nonseq | Gingers | buildapcforme | ThriftStoreHauls | BarCraft | braddoingbradthings | ... | henna | bronycringe | rawdenim | Dexter | Colts | NFLFandom | selfpublish | mattandbenedict | nyjets | LinusDiaries |
15112 | prepping | xmonad | SantaPorn | microgreens | TotalFark | SRSFartsAndCrafts | TwinGirls | birthcontrol | allthingszerg | VicPD | ... | trashynovels | cissp | findfashion | thingsoncats | nflblogs | detroitlions | OldEnglish | gallifrey | CFB | diamondmine |
15113 | Firearms | CasualMath | HobosGoneWild | naturalbodybuilding | OmegleVideos | MichelleTrachtenberg | aquajewhungerforce | cafe | starcraftnakama | internet2012 | ... | PunkLovers | bootstrap | asiantwoX | Pedberg | LinusDiaries | Bigtitssmalltits | bookshelf | gallifreyan | NFL_Draft | eagles |
15114 | reloading | ProgrammerHumor | gnomewild | EatCheapAndHealthy | CandidBikiniGirls | dataisterrifying | internetdefense | stepparents | Tumba | amateurastronomy | ... | TrackThrows | CREST | TwoXChromosomes | Allison_Scagliotti | JordanCox2 | 40something | BooksAMA | Torchwood | CHIBears | oaklandraiders |
15115 | ar15 | gnu | WTF_Wallpapers | PronePaddling | aitaiwan | LifeProTips | gamingpc | bigboobproblems | starcraft2clans | cavalierkingcharles | ... | femalewriters | AmyAdams | TodayIWore | pipesgonewild | effzeh | GreenBayPackers | books | mattsmith | Redskins | cowboys |
15116 | longrange | nethack | KittyPorn | DeathProTips | baozoumanhua | mcwex | buildapc | childfree | AllThingsTerran | BitcoinMagazine | ... | parentdeals | wmnf | malefashion | ohnoghosts | NFLFandom | knockoutgifs | alt_lit | wholock | eagles | Patriots |
15117 | Hunting | baduk | WhyWouldYouFuckThat | DepressedRage | NSFW_HotSlut_gonewild | wmnf | sexmix247 | 2XLite | Naytopia | trueshreddit | ... | amateursologirls | FailedFedora | blackgirls | vinyldjs | freddiemercury | happierNYC | 52book | MovieWallpapers | nflcirclejerk | Redskins |
15118 | BrassSwap | libredesign | WeirdSubreddits | 2Chainz | sexy_nonnude_teens | allstars | Reflections | curvygirls | Limu | technology | ... | SexyNerds | doctorwho | SRSTechnology | breakingbadcomics | NFLhuddle | buffalobills | bookhaul | hyperfurs | Reflections | CFB |
15119 | GunPorn | gtd | HalloweenPorn | IndianClubs | NSFW_nude_asian_teens | UKHealthcare | overclocking | elections | DoesAnybodyElse | TakeOneStepForward | ... | MilitaryFamilies | dwlounge | UnsentMusic | breakingbad | Bingo | MontrealCanadiens | BookCollecting | AmyAdams | panthers | Texans |
15120 | USMilitia | compscipapers | SpidersGoneWild | Nicaragua | NSFW_sexynudeteens | CarboholicsAnonymous | NSFWSector | mormoncringe | UKStarcraft | slandrogangstas | ... | veggieteens | hugs | goodyearwelt | memphismayfire | gameslist | Articles | booklists | doctorwho | HighlightGIFS | nyjets |
15121 | 1911 | LANL_French | DoesAnybodyElse | LifeProTips | wild_naked_girls | eisley | gifs | mobile | allthingsterranbot | preggocows | ... | SRSTechnology | hyperfurs | freeforallfashion | Earlyjazz | happierNYC | steelers | WoTreread | DoctorWhumour | gifs | CHIBears |
15122 rows × 100 columns
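The table is produced by a column-wise argsort followed by a lookup into the name array. Here is the same trick on a tiny example; `names` and `coords` are invented for illustration.

```python
import numpy as np

# Hypothetical data: three items scored on two axes.
names = np.array(["a", "b", "c"])
coords = np.array([
    [0.9, -0.2],
    [0.1,  0.7],
    [-0.5, 0.3],
])

# argsort along axis 0 ranks the rows separately within each column (ascending);
# reversing the result puts the highest score first.
order = np.argsort(coords, axis=0)[::-1]

# Indexing the name array with the index matrix turns row numbers into names.
ranked = names[order]
print(ranked)
```

Each column of `ranked` is an independent ranking, so the top of a column shows what scores highest on that axis and the bottom shows what scores lowest.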
pd.DataFrame(subreddits[np.argsort(embedded_coords[:,[0, 1, 44, 51, 84, 50, 47, 40]], axis=0)[::-1]],
             columns=[
                 "0: big - small",
                 "1: big - small",
                 "44: soccer - guns",
                 "51: programming - food",
                 "84: music - bikes",
                 "50: osx - books",
                 "47: wow - starcraft",
                 "40: male grooming - life hacks"
             ])
# not shown but also amusing:
# 14: music - pot
# 24: science - porn
 | 0: big - small | 1: big - small | 44: soccer - guns | 51: programming - food | 84: music - bikes | 50: osx - books | 47: wow - starcraft | 40: male grooming - life hacks |
---|---|---|---|---|---|---|---|---|
0 | AskReddit | todayilearned | bevandele | threads | BibleBelievers | osx | ShitLiamDoes | malefashionadvice |
1 | funny | worldnews | ParisSG | smart | christianblogs | simpleios | silverandguns | malefashion |
2 | pics | politics | soccerbot | hocnet | mormonapologetics | ios | techsupport | goodyearwelt |
3 | WTF | videos | FootballMedia | pervasivecomputing | beerpong | retina | uscgames | rawdenim |
4 | gaming | technology | FTLStrikers | Suomipelit | anarchist_aid | iOSProgramming | Goblinism | frugalmalefashion |
5 | AdviceAnimals | blog | PrettyOlderWomen | lisp_ja | Jesus | appletv | Transmogrification | yusufcirclejerk |
6 | IAmA | promos | soccer | gnu | TrueChristian | mac | wow | AustralianMFA |
7 | videos | science | reddevils | ProgrammerArt | biogas | apple | wowscrolls | mfacirclejerk |
8 | todayilearned | til | Gunners | app | Christianity | macapps | WowUI | shittymspaints |
9 | atheism | atheism | FantasyPL | jenkinsci | JustChristians | applehelp | WoWGoldMaking | europeanmalefashion |
10 | Omaha | SOPA | coys | DanceTutorials | ChristianBooks | macsetups | wowstrat | malehairadvice |
11 | aww | IAmA | Barca | illjustleavethishere | ChristianCreationists | iPhoneDev | woweconomy | ldshistory |
12 | UIUC | ColbertRally | chelseafc | cloudcomputing | therealcollective | ipad | wowguilds | malefoodadvice |
13 | houston | USNEWS | realmadrid | softwaredevelopment | Reformed | WebApps | WoWStreams | itafterdark |
14 | Columbus | CMDY | LiverpoolFC | ComputerTips | stickers | iphone | wowpodcasts | TeenMFA |
15 | Dallas | NorthKoreaNews | MCFC | scheme | TheArk | iphonehelp | wowraf | preppy |
16 | politics | OperationGrabAss | footballmanagergames | ProgrammerHumor | RadicalChristianity | jailbreak | 24hoursupport | adventureporn |
17 | Austin | WikiLeaks | bootroom | ocaml | truestchristian | AlienBlue | bubbleswithfaces | breitling |
18 | kansascity | sandy | Fifa13 | tmbo | Sidehugs | AppHookup | worldofwarcraft | uniqlo |
19 | Tucson | AnythingGoesUltimate | fcbayern | QueerTheory | biblestudy | obits | WoWNostalgia | gayforsaulyd |
20 | Purdue | reddit.com | football | emacs | Catacombs | commonsense | FTH | malelivingspace |
21 | orlando | northkorea | FIFA | csbooks | ChristianApologetics | SEGA32X | redditguild | Watches |
22 | Boise | athiesm | FIFA12 | d_language | PrayerRequests | macgaming | Rift | swagteamsix |
23 | Atlanta | Futurology | EA_FIFA | coding | Fiveheads | jasmineapp | ALS | paulrudd |
24 | Charlotte | silentcinema | borussiadortmund | CatsWithPeopleFeet | sketches | ScreenplayCoverage | MMORPG | mensfashionadvice |
25 | fsu | occupywallstreet | SoccerBetting | feministFAQ | OpenChristian | iOSthemes | MovieWallpapers | soccergaming |
26 | Louisville | softscience | ACMilan | Big_Ood | cristianoronaldo | flextweak | wowtcg | FantasyPL |
27 | SaltLakeCity | peoplesliberation | Aleague | securityCTF | ShitZimnySendsMe | zsh | punkshots | supremeclothing |
28 | VirginiaTech | ArtisanVideos | NUFC | Clojure | ahmadiyya | Panera | turnoverpie | PrettyOlderWomen |
29 | Hawaii | movies | footballtactics | blackflag | ChristianMusic | CSUDH | diablo3 | malegrooming |
... | ... | ... | ... | ... | ... | ... | ... | ... |
15092 | Bad_ass_girlfriends | Foamyfrogismoe | shittygunpictures | castiron | Boats_and_Beauties | Holmes | xenominer360 | Iron |
15093 | services | SiriAmA | Shotguns | ramen | MagicCardPulls | ournameisfun | wtfgames | ReVenture |
15094 | Pornoeverywherexxx | HelloProperAlice | scguns | Halloween_Town | Magicdeckbuilding | horrorlit | starcraft2_class | Neopsychedelia |
15095 | Hardcore_PornGifs | allHailKingNoel | progun | asianeats | melodicmetal | gallifrey | DNE | Fitness |
15096 | FitnessGirls | LambofGod | 300BLK | HiphopWorldwide | TechnicalDeathMetal | AmyAdams | quicklooks | homegym |
15097 | ShemaleSwag | VictorianSluts | opencarry | shittyhomes | Alisonhaislip | DoctorWhumour | Hobbies | AlternativeHealth |
15098 | Entrenched | dogdad | gunpolitics | grilling | mtgcube | lexlinguae | starcraft_strategy | fargo |
15099 | spacedikdiks | VintageMilf | Glocks | KitchenConfidential | magicTCG | write | Destiny | ecig_vendors |
15100 | frontiama | Onderka | CZFirearms | icecreamery | Deathmetal | logophilia | skypepartycirclejerk | Hutchinson |
15101 | front | softcorenights | tc_archery | Chefit | mtgaltered | litvideos | asd | askedreddit |
15102 | frontscience | CarShowHunnies | knives | recipes | spikes | doctorwho | castit | kettlebell |
15103 | 2012askanything | RimofReality | gundeals | VictoriaSecret | EDH | neilgaiman | StarcraftCirclejerk | StudentNurse |
15104 | bipoliticalandcurious | HappyBaracky | Mini14 | TakeaPlantLeaveaPlant | Metal | 80sElectro | HotSBetaKeys | kingcounty |
15105 | johndollarfullofcrap | shitclintonsays | guns | sushi | metalmusicians | booksuggestions | itmejp | volunteeringsolutions |
15106 | alfrankenstein | babel | CompetitionShooting | Breadit | AliceInChains | ROIO | slothswitharms | todayiwatched |
15107 | GreenCarLovers | notocispa | CCW | DavisSquare | Musex | orderofthephoenix | StarcraftDeutschland | RAMD |
15108 | AutomobileTechnology | ProjectScarlett | ak47 | GMO | ISU_GDC | hyperfurs | starcraft | fitmeals |
15109 | albanianarchandpics | bestplacetolearn | saiga | 52weeksofbaking | MetalLadyBoners | Random_Acts_of_Books | allthingsprotoss | xxfitness |
15110 | engineersissues | mediocregiraffes | Gunsforsale | judytrinh | 7String | artshub | wowbro | shittyshitredditsays |
15111 | nonseq | HIV | gats | Daleks | mtgfinance | selfpublish | BarCraft | ShittyTheoryOfReddit |
15112 | TotalFark | superheroinesdefeated | prepping | Charcuterie | spacedikdiks | OldEnglish | allthingszerg | microgreens |
15113 | OmegleVideos | violentpornography | Firearms | 52weeksofcooking | Entrenched | bookshelf | starcraftnakama | naturalbodybuilding |
15114 | CandidBikiniGirls | necroporn | reloading | sousvide | Cichlid | BooksAMA | Tumba | EatCheapAndHealthy |
15115 | aitaiwan | deandrefall | ar15 | FoodPorn | vomitporn | books | starcraft2clans | PronePaddling |
15116 | baozoumanhua | kasperrosa | longrange | food | TheStuntMuffins | alt_lit | AllThingsTerran | DeathProTips |
15117 | NSFW_HotSlut_gonewild | novafactory | Hunting | CajunMusic | BestSleepOfYourLife | 52book | Naytopia | DepressedRage |
15118 | sexy_nonnude_teens | marathi | BrassSwap | FoodVideos | nationalsciencebowl | bookhaul | Limu | 2Chainz |
15119 | NSFW_nude_asian_teens | PsiUEI | GunPorn | sharksinclothes | johndollarfullofcrap | BookCollecting | DoesAnybodyElse | IndianClubs |
15120 | NSFW_sexynudeteens | Factories | USMilitia | creepywiki | alfrankenstein | booklists | UKStarcraft | Nicaragua |
15121 | wild_naked_girls | AskReddit | 1911 | appetizers | Metal_Alberta | WoTreread | allthingsterranbot | LifeProTips |
15122 rows × 8 columns
import bokeh.plotting as bp
from bokeh.models import HoverTool  # bokeh.objects was renamed to bokeh.models in later releases
bp.output_notebook()
row_selector = np.where(users_per_subreddit>100)