LinkedIn’s Fake Profile Problem

Something suspicious is going on with LinkedIn invitations. I’ve had about 5 of these invitations with the exact same text, but different profiles and photos:

Which is real? Accounts associated with Ukraine appear to be bot-generated.
Two recent, identical invitations. Last year, I received additional invitations with identical wording. It seems some automated process is generating these profiles on LinkedIn.

The photos can be reverse-image search traced to other LinkedIn profiles with different names or pages from other people on the web.

The text opens with, “hi {name}, I am from Ukraine, just wanted to contact you and be in touch, maybe we could work together in the future.”

This was previously noticed by @larsjuhljensen:

The images from profiles are typically also found on other sites.

A Ukraine flag is used as the banner. Looking at one of the profiles, we see that it references “Temy” as a company. This is different from the “currently employed” company listed. It is a commonality of these profiles.
It seems the source could have been this .ru domain (other sites appeared, but this one had the uncropped photo).

Liubov, or Sophia?

Who is the imposter in this case? (Recall “Liubov” from the first image.)

First, here is the profile from the “network connection request” on LinkedIn. Notice the “Temy” reference in the About section, which doesn’t line up with the company.
Reverse image search: we see that the original image, likely the uncropped one, is found elsewhere. Note the Moscow reference.
Another LinkedIn profile using the same photo (uncropped), but with a different name and title.

Kate, or Anna?

Below is another instance of the same type of profile, where the user’s image appears on another site with a different name.

A typical profile showing the Ukraine flag with the standard “About” section.
Same image, different name, different website, different job!
Another profile on LinkedIn. The profile history includes a background in Moscow.

Possible Explanations

It’s too early to draw any conclusions, but I’d be curious if others have ideas.

Some commonalities include:

  • use of the Ukraine flag as a background
  • profile sources somehow being connected with Moscow
  • copied invitation text and About text in the profile
  • a company mismatch in the profile

I’ve reached out to a couple people from this investigation and am waiting for replies.


[not] Trusting Reddit

Because of the issues we find with trusting Amazon reviews, we often turn to Reddit for product opinions. But, as with all growing markets, bad actors find a way to infiltrate them. A superficial Google Trends comparison provides evidence in favor of the hypothesis that advertisers are increasingly turning to Reddit for advertising, but not the “Right Way.”

Evidence that interest in buying Reddit accounts is growing faster than interest in advertising on Reddit.

So, I surveyed the marketplace. There are a lot of accounts for sale!

Accounts for sale on one of the marketplaces that came up in a search. (“Karma” is basically “reputation points.”)

It makes you wonder: how much of Reddit is “real”?

Note: I decided it’s probably better to bring awareness to the issue, so that countermeasures can be developed over time, including user skepticism. Reddit has an interest in both maintaining community trust, and in promoting its advertising as preferable to the black-hat approach. On the other hand, this also brings awareness to black-hats, although they probably already knew about this through other channels.

Note to Reddit’s ML team: I think you could detect these kinds of shenanigans by looking at signals coming from a change in IP address, browser fingerprint, and a shift in content (especially toward promotional or political posts) all right around the same time period. You could even set up a honeypot to get some ground truth…
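To make the suggestion above concrete, here is a minimal sketch of the "everything changed at once" check. It is not Reddit's method: the `Event` record, the three signal names, and the 7-day window are all assumptions I made up for illustration, and a real system would need far richer features.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Event:
    """One hypothetical account event; the field names are illustrative."""
    when: datetime
    kind: str  # "ip_change", "fingerprint_change", or "content_shift"

# The three signals named in the note above (labels are my own).
REQUIRED = {"ip_change", "fingerprint_change", "content_shift"}

def looks_like_handoff(events, window=timedelta(days=7)):
    """Return True if all three signal kinds occur within one time window."""
    events = sorted(events, key=lambda e: e.when)
    for i, start in enumerate(events):
        seen = set()
        for e in events[i:]:
            if e.when - start.when > window:
                break  # later events fall outside the window starting here
            seen.add(e.kind)
        if REQUIRED <= seen:
            return True
    return False
```

An account whose IP, fingerprint, and content all shift within a few days would be flagged for review; the same three changes spread over months would not.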


Beware “Amazon’s Choice”

I recently bought a stud finder from Amazon based on it having the “Amazon’s Choice” tag – a reliable shortcut, right? Then the product arrived.

What struck me as initially weird about this product was that the instructions were all in an extremely small font – I couldn’t read them without straining my eyes, and there was no website link.

So, I searched online for the model number, TH220, but instead found references to another product by “FOLAI” with some poorly made YouTube reviews. That product looked exactly the same, except for the color and labeling.

I went back to the Amazon product page for “The Original Pink Box” stud finder and scanned the reviews (emphasis on “Original” is mine; you’ll understand the irony shortly). It was apparent that the majority of the reviews were for other unrelated products like a tape measure. Then, I noticed something strange in the “Style” selector tool for the stud finder. It appears that a hack was applied to have multiple, unrelated products all linked to this one by using the “style” feature.

These various products were all attributed to this stud finder by Amazon. Thus, the stud finder receives irrelevant reviews by using the “Style” feature. Most of the reviews are for tape measures and screwdrivers.

I’m not sure this is a case of “review hijacking,” since the reviews were presumably for other products by the same company.

I contacted Amazon support to ask what was going on, but support was initially useless: they replied to my inquiry with a description of the product. After further discussion, they assured me about the quality of their inventory and offered that I could return it for a refund, and then gave me a $5 credit.

The instructions themselves – although too small to read comfortably – had reasonably clear English, suggesting a possible American manufacturer. So, I searched for some of the text strings from the manual and found it copied to more clones of this product:

  • “Shibeier” stud finder on Sears’ website
  • “INTEY” stud finder
  • “FOLAI”
  • “Baqsoo” (also found on Amazon: a search for text from the manual revealed that some of the product page’s Q&A was keyword-stuffed with text from the manual)
  • “TAVOOL”
  • “ZBYZF”
  • (SainSmart ToolPac SMA19) – finally – what appears to be the original source of the instruction manual: the same instructions, fewer typos (an artifact of OCR?), and a different model number cut & pasted.

So, which is the real manufacturer? And which are knockoffs? And when can you trust “Amazon’s Choice?” What does “choice” imply, anyway? It’d seem to imply some editorial review by Amazon, but since Amazon won’t say, it is reasonable to speculate that it is actually an algorithmic semi-sponsored result, in the sense that Amazon strikes some balance between expected profit margin and rating.

What’s a consumer to do? I checked ReviewMeta and Fakespot for reviews, but both failed the test:

ReviewMeta trusted nearly all the irrelevant reviews.

Above is ReviewMeta. Below is Fakespot.

Fakespot also trusted the bad reviews.

So what exactly is going on? My guess is that “manufacturers” are being invented at an increasing pace in order to flood the search results with listings of the same product, and Amazon is struggling to keep up with the pace.

It seems we’re seeing a rise in counterfeits on Amazon. Increasingly, it’s smart to return to name brands, ignore online reviews, and shop from retailers who vet the products that they sell.

So, did the stud finder work? I’m not sure whether I can trust it to detect wires in the wall, since the results seem strange and filled with a lot of false positives. As a non-expert, I need a name brand for comparison to see if the readings I’m getting are reliable.


The Coming Rise of Drone Pirates?

Prepare for Amazon’s drone delivery system. Along with it, prepare for a new form of theft: drone piracy.

As your package makes its way toward home, drone interceptors will snag packages – and possibly the delivery drone itself.

Why shouldn’t we expect this? Criminals love the anonymity they have on the ‘net. Similarly, drone pirates will be able to stay relatively distant from their crimes. The only question is to what extent we should expect drone piracy; not whether it will happen.

To thwart this new type of crime, delivery systems will no doubt implement countermeasures:

  • low-risk package types
  • safe flight paths
  • GPS tracking

By employing drone delivery only for packages with low street value, there will be less incentive to attempt this kind of piracy. If most packages are low value, then piracy can be kept to the realm of low-ROI theft. Still, it would be in the courier’s interest to push the limits of what can be delivered by air, and no doubt machine learning will be deployed to predict the likelihood of accident, including but not limited to theft and injury-by-falling-drone.

By following safer flight paths, couriers lower the liability associated with drone interceptors colliding with delivery drones mid-air and sending drones falling from the sky. (Note that Amazon appears to be testing in rural areas first.)

By employing GPS tracking, delivery companies can hopefully reduce the “anonymity” appeal of drone hijacking. This can include the insertion of returnable GPS tracking devices into the packages themselves at random.

See Amazon Prime Air.


Selection Bias in Ads

I received a mailer ad from an insurer that reported “an average of $507 in savings.” That headline grabbed my attention.

However, I read the footnotes, which tell us that the statistic comes from those

…who reported savings by switching their auto insurance…

Subtly, multiple selection effects are present:

  1. only those who experienced savings
  2. only those who reported their savings

Those who did not experience savings may have experienced the opposite, but were excluded from the sample. Of those who reported savings, we can expect that the sample will bias toward those who have experienced the greatest benefit, and considered their savings worthy of reporting.

Thus, it seems that this ad campaign benefits from the statistical illiteracy of its audience to overestimate their expected savings.
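A quick simulation shows how strong these two filters can be. Every number here is invented for illustration (a true average change of $0, a $400 spread, and a reporting probability that grows with the amount saved); none of it comes from the insurer's survey.

```python
import random
from statistics import mean

def simulate(n=100_000, seed=0):
    """Compare the true average with the average after both selection steps."""
    rng = random.Random(seed)
    # Hypothetical: annual change in premium after switching, in dollars.
    # Positive means the household saved money; the true mean is zero.
    changes = [rng.gauss(0, 400) for _ in range(n)]
    # Selection effect 1: keep only those who experienced savings.
    savers = [c for c in changes if c > 0]
    # Selection effect 2 (hypothetical): bigger savers are likelier to report.
    reporters = [c for c in savers if rng.random() < min(1.0, c / 1000)]
    return mean(changes), mean(savers), mean(reporters)

everyone, savers, reporters = simulate()
print(f"all switchers:          {everyone:8.2f}")
print(f"those who saved:        {savers:8.2f}")
print(f"those who reported it:  {reporters:8.2f}")
```

Under these assumptions, the average over everyone is near zero, yet the average among those who "reported savings" runs to hundreds of dollars, even though switching does nothing on average.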

The full footnote says,

Average annual per household savings based on a 2018 national survey by State Farm of new policyholders who reported savings by switching to State Farm.

In the below screenshot, see the footnote in a variation of this ad:

This version, which I pulled from their site, adds a third qualifier to the effect size: “up to” (!). By doing this, they can present an even higher dollar amount – even further from the expected savings the average consumer can experience.

Honestly, I’m not even sure how to interpret this myself: what does “up to an average value of $X” mean in this context?


Amazon’s Fake Review Problem

Update – the fake reviews featured below were removed shortly after this post blew up on Hacker News.

One reason we buy from Amazon: plenty of reviews. But what if many of Amazon’s top-reviewed items have fake, paid reviews?

I was looking for a sunrise alarm clock this morning and started searching through the many reviews, filtering by ones that mentioned “minutes,” since I wanted to learn about the product’s timer feature. This surfaced a bunch of similar-looking reviews:

Here, we see both the top and bottom review with the sentence,

The light can be pretty bright, you can adjust it where it’ll be dim and slowly brighten 30 minutes before the alarm time.

Did “Becky” and “Dione Milton” really both happen to write a review with the exact same 23-word sentence? Or, is it more likely that they are agents sourcing reviews from a script, and they sloppily pasted their reviews without rewriting them (as they were presumably instructed to do)? Note also the post dates: December 12, 2017. “Becky” and “Dione Milton” both had private profiles, where their 5-6 reviews were hidden – very similar looking.
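This kind of copy-paste is cheap to detect. Below is a rough sketch, not anything Amazon is known to run: it splits reviews into sentences, normalizes them, and reports any long sentence used verbatim by more than one reviewer. The function name, the 8-word threshold, and the reviewer labels are my own assumptions; only the repeated sentence is taken from the reviews above.

```python
import re
from collections import defaultdict

def shared_sentences(reviews, min_words=8):
    """Map each long sentence to the set of reviewers who used it verbatim."""
    seen = defaultdict(set)
    for reviewer, text in reviews.items():
        # Naive sentence split on ., !, or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            # Lowercase and drop punctuation so trivial edits don't hide a match.
            normalized = " ".join(re.findall(r"[a-z0-9']+", sentence.lower()))
            if len(normalized.split()) >= min_words:
                seen[normalized].add(reviewer)
    return {s: who for s, who in seen.items() if len(who) > 1}

SENTENCE = ("The light can be pretty bright, you can adjust it where it'll be "
            "dim and slowly brighten 30 minutes before the alarm time.")
# Hypothetical reviewers; the text around SENTENCE is made up.
reviews = {
    "reviewer_a": "Nice clock. " + SENTENCE,
    "reviewer_b": SENTENCE + " Would buy again!",
    "reviewer_c": "It works fine for me so far and I have no complaints.",
}
for sentence, who in shared_sentences(reviews).items():
    print(sorted(who), "->", sentence)
```

The word threshold matters: short phrases like "great product" repeat innocently, while a 23-word sentence repeating by chance is very unlikely.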

Amazon – which has some of the world’s most advanced ML – really needs to step up its review fraud detection game. Imagine how great the Amazon shopping experience would be if we could trust its reviews.

Third party meta review sites like Fakespot will identify problems for us (in this case, the product got an “F” grade) – so why doesn’t Amazon?

Amazon: you can do better.

Update 2020-09-23: you might want to also watch “why Amazon has a fake review problem.”


Garbage sites on Flippa

I subscribed to announcements of websites for sale on Flippa. Over the past several months, I’ve noticed an interesting pattern.

Here is a notification that arrived in my inbox today:

questionable auctions
Flippa is filled with auctions for websites promising easy revenue. Above are two sites by the same seller.

The email summarized a couple listings for websites being auctioned. Both of the above listings make promises in the headlines for 100% “automated” and “work-free” sites. If the owner had such successful, automated sites, why would he be motivated to sell them? This should at least raise suspicion.

What about the traffic statistics and financials? Below is a screenshot of what the author claimed:

Claimed earnings
Claimed earnings vs traffic

Note the spike in traffic for the past month. 0 to 15k visits in a month! What a massive upward trend, right? But wait, it seems that the traffic has extremely little correlation to ad revenue. We see a claimed steadily increasing monthly ad revenue report, which should normally strongly correlate with traffic. One explanation is that the owner just installed Google Analytics last month, and traffic stats started accumulating then. Another explanation is that the author somehow spiked the traffic stats for a month.

Suspicious of the source of this traffic, I checked Alexa, a web traffic analytics company. According to Alexa, the vast majority of traffic comes from India. Why might that be?

Inflated traffic from India

One easy way to inflate traffic stats is to artificially and temporarily increase them by hiring cheap labor and/or automated “bots” (computers) to visit the site. Since the same seller had listed each site, was it any surprise that both sites had the vast majority of their traffic coming from India? This proves nothing, but is certainly alarming, since there’s no particular reason to expect the vast majority of traffic to come from India for sites like these.

The seller claims in the listing title that the site is 100% “automated.” If this is the case, then how is the copy being written? Or, are articles just scraped (copied verbatim) from other sites? One way to see if the content of the website was copied is to search a string of text that should be unique to the site. In this case, a Google search reveals that the copy for the test string I chose is found on over one-thousand other sites. This suggests that the copy for this site is most likely scraped from other sites. Google is known to strongly penalize sites for this behavior.

flippa scams
over a thousand other sites contained this exact string of text

We can take a guess, with pretty high confidence, that the site for sale is not the original creator of the “automated” content.

Once again, the auction listing title states the site is “100% automated” and requires “0 work.” Now, read the disclaimer included at the bottom of the auction:

This is great website for anyone interested in Profit making cash cow  with Health concept I like to be completely transparent in my transactions and don’t like to mislead anyone. What you see is what you get. It does involve effort and money won’t come from no nothing. The key to getting this off the ground is auto content and unique

So, in other words, the “key” to success is not leaving the site on autopilot, and “unique” [sic] – unique content, that is. As demonstrated, this site is far from “unique,” having articles that appear on a thousand other sites.

This type of listing is extremely common on Flippa. I’ve noticed this pattern over the past several months. I’m not the first to observe that the site is rife with scams, though. Feel free to learn more about the scams on Flippa.

On a final note, not everything sold on Flippa is junk – just the majority! Buyer beware!


Prediction, Day Trading, and Confirmation Bias

Why do most day traders persist, despite a lack of success?

I’ve listened to several day traders speak at length about their progress, and heard a common thread: in explaining their slow progress, they speak of the difficulty of mastering their emotions. The traders don’t know each other, but they follow the same approach to learning the discipline: they trade real stocks with real money in real time as they hone their skills.

These traders correctly identify that regardless of the overall accuracy in their trading strategies, they will have ups and downs. However, when explaining long-term failures, they continue to cite a lack of mastery of emotions, without suggesting the possibility of the alternative explanation: a strategy that just doesn’t work. Although these traders are not following a precise algorithm, they could be, if only they were able to define their trading strategy with sufficient precision. The “emotions” factor could then be taken out of play, and the traders could see whether their algorithms were viable from a back testing approach: they would apply their algorithms to a large sample of past situations to see how their portfolios would have performed with fear and greed outside of the picture.

It’s not true that favorable back testing guarantees positive future performance, but it is highly probable that in day trading, back testing that yields negative results implies that the algorithm would not perform well in the future. Yet no time is spent on initial back testing to weed out poor strategies; instead, these strategies are first tested in real time over a span of costly years.
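For a sense of how little machinery a first back test needs, here is a toy sketch. The price series and the trading rule are made up, and it ignores commissions, slippage, and everything else a serious back test must handle; it only shows the shape of the idea, which is to replay a precisely stated rule over past data with no emotions involved.

```python
def backtest(prices, strategy):
    """Replay `strategy` over past prices and return the final value of $1.

    `strategy(history)` sees only prices up to the current step and returns
    True to hold the asset over the next step, or False to stay in cash.
    """
    value = 1.0
    for t in range(1, len(prices)):
        if strategy(prices[:t]):
            value *= prices[t] / prices[t - 1]
    return value

def buy_after_up_day(history):
    """Hypothetical rule: hold only if the most recent step was a gain."""
    return len(history) >= 2 and history[-1] > history[-2]

def always_hold(history):
    """Baseline for comparison: buy and hold."""
    return True

prices = [100, 102, 101, 103, 102, 104, 103, 105]  # made-up series
print("rule:        ", round(backtest(prices, buy_after_up_day), 4))
print("buy and hold:", round(backtest(prices, always_hold), 4))
```

On this invented series the rule loses money while buy-and-hold gains. A real test would need a far larger sample, but a result like this is the kind of early warning that costs nothing to obtain.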

If these traders really wanted to see whether discipline was at the heart of the problem, they could still do it – but they don’t, because it is both emotionally and mathematically challenging to embark on an attempt to disconfirm their hypotheses. However, if only they sought to disconfirm hypotheses from the start, before they became so invested in them, then alternate hypotheses showing more promise could have been tested over the years.

In day trading, periods of success are overly attributed to evidence that the strategy works, and periods of failure are attributed to failure in application of the strategy. The strategy itself is kept insulated from criticism. And because of the difficulty in separating the signal from the noise, the illusion easily persists.


Last Cookie Hypothesis

the last cookie
a familiar scene

How often have you walked past the tray of cookies at your office and noticed that there is only one cookie left? My guess is a strangely disproportionate number of times. You could substitute any baked good (brownies, muffins, etc.) for “cookie” in this observation.

So, what’s the deal? Why is there so often precisely one cookie left?

I’ve asked others. They usually claim an altruistic reason like “not being greedy.” I call shenanigans!

So why, then, is there only one cookie left? On some level (conscious or not), the others considered the possibility that there is something wrong with this cookie, given that it is still there after all this time. Let’s face it: if 20 cookies have already been eaten, then this one was passed over 20 times. It is also guaranteed to have been sitting there the longest possible time of all the cookies.

One more motivation: nobody wants to clean up the cookie tray. Taking that last cookie leads to some sense of responsibility for cleaning up the mess.

Now you can sleep at night, understanding why it seems there’s always one last cookie sitting on a plate at the office.


The Fallacy of Entrepreneurship’s Expected Value

There are plenty of popular entrepreneurship motivational bloggers who preach along the lines of “just do it.” Some even attempt to mathematically show the rational move is to initiate your startup. At the core of one popular analysis is a false premise that leads people into deciding that they should “go for it” despite their intuition.

The following is from the popular poker entrepreneur Billy Murphy:

“So, if I was trying to decide whether I should work a job or start a business, could I use EV to help me?”

Absolutely— it’s a perfect spot to use EV.

Here’s how his expected value analysis would apply to deciding whether to start a business:

  1. You can choose to “work for the man” or build your own business
  2. You can calculate your expected earnings (the sum of all potential earnings outcomes times probability of each outcome) as an entrepreneur, and compare this to your known earnings as an employee
  3. If your expected earnings as an entrepreneur are significantly greater, then you should go for it

Let’s overlook the success bias that entrepreneurs in general will have in assigning a probability to their chances of success, and assume that the entrepreneur is conservative in his estimates. The problem I want to point out is that of the marginal utility of wealth.

Consider that the incremental value of any dollar amount you receive will decrease each time your account increases by that amount: e.g. your first $50,000 matters a lot more to your well-being and happiness than does the next $50,000. This is known as the diminishing marginal utility of wealth.

In the probability analysis, this is not accounted for. The reason your intuition tells you not to start a business is the same reason that people play it safe and “work for the man.” They have a correct gut feeling that the $50,000 / year guaranteed salary provides greater expected utility to them than entrepreneurship. Going broke in entrepreneurship means losing out on the first $50,000 (having $0 – or worse, $0 and debt). Such a great amount of the expected utility of wealth is front-loaded into that. Your quality of life would probably decline much more in this case ($50,000 to $0) than it would from the effects of going from a $100,000 salary to a $50,000 salary.
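Here is a small worked example of the gap between the two calculations. The payoffs and probabilities are invented, and logarithmic utility is just one common textbook stand-in for diminishing marginal utility, not a claim about anyone's real preferences.

```python
import math

def expected_value(outcomes):
    """Sum of payoff times probability over (probability, payoff) pairs."""
    return sum(p * x for p, x in outcomes)

def expected_log_utility(outcomes):
    """Expected utility under the assumed utility function u(x) = ln(x)."""
    return sum(p * math.log(x) for p, x in outcomes)

# Hypothetical choices, as (probability, annual income in dollars).
job = [(1.0, 50_000)]
startup = [(0.1, 1_000_000), (0.9, 5_000)]

for name, outcomes in [("job", job), ("startup", startup)]:
    ev = expected_value(outcomes)
    eu = expected_log_utility(outcomes)
    # Certainty equivalent: the guaranteed income with the same utility.
    ce = math.exp(eu)
    print(f"{name:8s} EV = {ev:10,.0f}   certainty equivalent = {ce:10,.0f}")
```

With these numbers, the startup has roughly twice the expected value of the job, yet under log utility it is worth about as much as a guaranteed income of well under $10,000. The frequent bad outcome dominates because so much utility is front-loaded into the first dollars.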

Mr. Murphy’s expected value calculations are great in situations like poker games because many games are played over the course of life and you, the exceptional poker player, always come out ahead by playing the greater expected value (assuming you are wise enough never to risk everything in one hand, where you can lose everything). In the game of life, we have to consider the expected utility, not the expected value. And the expected utility of entrepreneurship is very low for most people.