LinkedIn’s Fake Profile Problem

Something suspicious is going on with LinkedIn invitations. I’ve had about 5 of these invitations with the exact same text, but different profiles and photos:

Which is real? Accounts associated with Ukraine appear to be bot-generated.
Two recent, identical invitations. Last year, I received additional invitations with identical wording. It seems some automated process is generating these profiles on LinkedIn.

The photos can be traced via reverse image search to other LinkedIn profiles with different names, or to pages from other people elsewhere on the web.

The text opens with, “hi {name}, I am from Ukraine, just wanted to contact you and be in touch, maybe we could work together in the future.”

This was previously noticed by @larsjuhljensen:

The images from profiles are typically also found on other sites.

A Ukraine flag is used as the banner. Looking at one of the profiles, we see that it references “Temy” as a company, which differs from the company listed under current employment. This mismatch is common across these profiles.
It seems the source could have been this .ru domain (other sites appeared, but this one had the uncropped photo).

Liubov, or Sophia?

Who is the imposter in this case? (Recall “Liubov” from the first image.)

First, here is the profile from the “network connection request” on LinkedIn. Notice the “Temy” reference in the About section, which doesn’t line up with the company.
Reverse image search: we see that the original image, likely the uncropped one, is found elsewhere. Note the Moscow reference.
Another LinkedIn profile using the same photo (but uncropped), but different name and title.

Kate, or Anna?

Below is another instance of the same type of profile, where the user’s image appears on another site with a different name.

A typical profile showing the Ukraine flag with the standard “About” section.
Same image, different name, different website, different job!
Another profile on LinkedIn. The profile history includes a background in Moscow.

Possible Explanations

It’s too early to draw any conclusions, but I’d be curious if others have ideas.

Some commonalities include:

  • use of the Ukraine flag as a background
  • profile sources somehow being connected with Moscow
  • copied invitation text and About text in the profile
  • a company mismatch in the profile

I’ve reached out to a couple of people from this investigation and am waiting for replies.


[not] Trusting Reddit

Because of the issues with trusting Amazon reviews, we often turn to Reddit for product opinions. But, as with all growing markets, bad actors find a way to infiltrate it. A superficial Google Trends comparison supports the hypothesis that advertisers are increasingly turning to Reddit for advertising, but not the “Right Way.”

Evidence that interest in buying Reddit accounts is growing faster than interest in advertising on Reddit.

So, I surveyed the marketplace. There are a lot of accounts for sale!

Accounts for sale on one of the marketplaces that came up in a search. (“Karma” is basically “reputation points.”)

It makes you wonder: how much of Reddit is “real”?

Note: I decided it’s probably better to bring awareness to the issue, so that countermeasures can be developed over time, including user skepticism. Reddit has an interest in both maintaining community trust, and in promoting its advertising as preferable to the black-hat approach. On the other hand, this also brings awareness to black-hats, although they probably already knew about this through other channels.

Note to Reddit’s ML team: I think you could detect these kinds of shenanigans by looking at signals coming from a change in IP address, browser fingerprint, and a shift in content (especially toward promotional or political posts) all right around the same time period. You could even set up a honeypot to get some ground truth…
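The coincident-change idea can be sketched in a few lines. Everything below is hypothetical: the signal names and the event representation are invented for illustration, not anything Reddit actually exposes. The point is only that any one signal alone is noisy, but all three firing in the same short window is suspicious.

```python
from datetime import datetime, timedelta

def flag_suspicious(events, window=timedelta(days=7)):
    """Flag an account if an IP change, a browser-fingerprint change, and
    a content shift all occur within one short window of each other."""
    required = {"ip_change", "fingerprint_change", "content_shift"}
    for _name, t in events:
        # all signals that fired within `window` of this event
        nearby = {n for n, t2 in events if abs(t2 - t) <= window}
        if required <= nearby:
            return True
    return False

# A hypothetical account whose signals all fire within the same week:
events = [
    ("ip_change", datetime(2021, 3, 1)),
    ("fingerprint_change", datetime(2021, 3, 2)),
    ("content_shift", datetime(2021, 3, 4)),
]
print(flag_suspicious(events))  # True
```

A real system would of course score these signals probabilistically rather than require all three, but the honeypot idea above would supply the ground truth to calibrate that.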


Beware “Amazon’s Choice”

I recently bought a stud finder from Amazon based on it having the “Amazon’s Choice” tag – a reliable shortcut, right? Then the product arrived.

What struck me as initially weird about this product was that the instructions were all in an extremely small font – I couldn’t read them without straining my eyes, and there was no website link.

So, I searched online for the model number, TH220, but instead found references to another product by “FOLAI” with some poorly made YouTube reviews. That product looked exactly the same, except for the color and labeling.

I went back to the Amazon product page for “The Original Pink Box” stud finder and scanned the reviews (emphasis on “Original” is mine; you’ll understand the irony shortly). It was apparent that the majority of the reviews were for other unrelated products like a tape measure. Then, I noticed something strange in the “Style” selector tool for the stud finder. It appears that a hack was applied to have multiple, unrelated products all linked to this one by using the “style” feature.

These various products were all attributed to this stud finder by Amazon. Thus, the stud finder receives irrelevant reviews by using the “Style” feature. Most of the reviews are for tape measures and screwdrivers.

I’m not sure this is a case of “review hijacking,” since the reviews were presumably for other products by the same company.

I contacted Amazon support to ask what was going on, but support was initially useless: they replied to my inquiry with a description of the product. After further discussion, they assured me about the quality of their inventory and offered that I could return it for a refund, and then gave me a $5 credit.

The instructions themselves – although too small to read comfortably – had reasonably clear English, suggesting a possible American manufacturer. So, I searched for some of the text strings from the manual and found it copied to more clones of this product:

  • “Shibeier” stud finder on Sears’ website
  • “INTEY” stud finder
  • “FOLAI”
  • “Baqsoo” (also found on Amazon via a search of the manual text, which turned up Q&A entries stuffed with keywords from the manual)
  • “TAVOOL”
  • “ZBYZF”
  • (SainSmart ToolPac SMA19) – finally – what appears to be the original source of the instruction manual: the same instructions, fewer typos (an artifact of OCR?), and a different model number cut & pasted.

So, which is the real manufacturer? And which are knockoffs? And when can you trust “Amazon’s Choice?” What does “choice” imply, anyway? It’d seem to imply some editorial review by Amazon, but since Amazon won’t say, it is reasonable to speculate that it is actually an algorithmic semi-sponsored result, in the sense that Amazon strikes some balance between expected profit margin and rating.

What’s a consumer to do? I checked ReviewMeta and Fakespot for this product’s reviews, but both failed the test equally:

ReviewMeta trusted nearly all the irrelevant reviews.

Above is ReviewMeta. Below is Fakespot.

Fakespot also trusted the bad reviews.

So what exactly is going on? My guess is that “manufacturers” are being invented at an increasing pace in order to flood the search results with listings of the same product, and Amazon is struggling to keep up with the pace.

It seems we’re seeing a rise in counterfeits on Amazon. Increasingly, it’s smart to revert back to name brands, ignore online reviews, and shop from retailers who do the vetting of products that they sell.

So, did the stud finder work? I’m not sure whether I can trust it to detect wires in the wall, since the results seem strange and filled with a lot of false positives. As a non-expert, I need a name brand for comparison to see if the readings I’m getting are reliable.


The Coming Rise of Drone Pirates?

Prepare for Amazon’s drone delivery system. Along with it, prepare for a new form of theft: drone piracy.

As your package makes its way toward home, drone interceptors will snag packages – and possibly the delivery drone itself.

Why shouldn’t we expect this? Criminals love the anonymity they have on the ‘net. Similarly, drone pirates will be able to stay relatively distant from their crimes. The only question is to what extent we should expect drone piracy, not whether it will happen.

To thwart this new type of crime, delivery systems will no doubt implement countermeasures:

  • low-risk package types
  • safe flight paths
  • GPS tracking

By employing drone delivery only for packages with low street value, there will be less incentive to attempt this kind of piracy. If most packages are low value, then piracy can be kept to the realm of low-ROI theft. Still, it would be in the courier’s interest to push the limits of what can be delivered by air, and no doubt machine learning will be deployed to predict the likelihood of accident, including but not limited to theft and injury-by-falling-drone.

By following safer flight paths, there is lower liability associated with drone interceptors colliding mid-air and sending drones falling from the skies (note that Amazon appears to be testing in rural areas first).

By employing GPS tracking, delivery companies can hopefully reduce the “anonymity” appeal of drone hijacking. This can include the insertion of returnable GPS tracking devices into the packages themselves at random.

See Amazon Prime Air.


Selection Bias in Ads

I received a mailer ad from an insurer that reported “an average of $507 in savings.” That headline grabbed my attention.

However, I read the footnote, which tells us that the statistic comes from those,

…who reported savings by switching their auto insurance…

Subtly, multiple selection effects are present:

  1. only those who experienced savings
  2. only those who reported their savings

Those who did not experience savings may have experienced the opposite, but they were excluded from the sample. And of those who saved, we can expect the sample to skew toward those who experienced the greatest benefit and considered their savings worth reporting.

Thus, it seems this ad campaign relies on the statistical illiteracy of its audience, who will overestimate their expected savings.
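A small simulation makes the effect concrete. The numbers below are invented (a zero-mean change in premium with a $400 spread, and a $200 “worth mentioning” threshold), not State Farm’s data; the point is only that conditioning first on “saved” and then on “reported” inflates the headline average:

```python
import random

random.seed(0)

# Hypothetical population: change in annual premium after switching,
# in dollars (negative = savings). The true mean change is about zero.
population = [random.gauss(0, 400) for _ in range(100_000)]

# Selection effect 1: only switchers who saved enter the sample
savers = [-x for x in population if x < 0]  # savings as positive dollars

# Selection effect 2: only those who consider the savings worth reporting
reported = [s for s in savers if s > 200]

print(f"mean savings, everyone:      ${-sum(population)/len(population):.0f}")
print(f"mean savings, savers only:   ${sum(savers)/len(savers):.0f}")
print(f"mean savings, reported only: ${sum(reported)/len(reported):.0f}")
```

Even though switching saves nothing on average in this toy population, the doubly selected sample reports hundreds of dollars of “average savings.”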

The full footnote says,

Average annual per household savings based on a 2018 national survey by State Farm of new policyholders who reported savings by switching to State Farm.

In the screenshot below, see the footnote in a variation of this ad:

This version, which I pulled from their site, adds a third qualifier to the effect size: “up to” (!). By doing this, they can present an even higher dollar amount – even further from the savings the average consumer can expect.

Honestly, I’m not even sure how to interpret this myself: what does “up to an average value of $X” mean in this context?


An Argument Against Hell

There are plenty of reasons not to believe in Hell. The problem of Hell (an instance of the problem of evil) suggests an incompatibility with a benevolent God. But here, I want to make especially clear just how problematic the infinite duration part of belief in eternal punishment is. Many people I know and care for seriously consider the possibility of an eternal Hell, and this sometimes leads to a discussion that I want to lay out loosely in the form of an argument.

For this argument, I suppose the existence of:

  1. a God
  2. who is benevolent
  3. who sends some of us to a place of eternal suffering

The focus of this argument deals with proportional or retributive justice. The proportion of eternal punishment in response to one’s actions in a finite lifetime is difficult to fully grasp, but can be made concrete.

When the religious assert the possibilities of Heaven or Hell as a reasonable reward or punishment coming from a just and benevolent god, they seem to feel that this is somehow just or proportionate because of free will. That is, they believe an infinite punishment could be a proportionate and reasonable response to a finite set of choices made in life, because one could have done otherwise, and those choices were sufficiently egregious. If a retributive response of infinite magnitude to a finite number of acts does not immediately strike you as terribly unjust, it should by the end of this post.

But first, note here that I’m leaving aside the separate problem of retributive punishment as an outdated, Iron-Age justice system, as well as the problem of free will – each of which on its own is sufficient to show irreconcilable conflict with the initial three assumptions. Here, the focus is on proportionality of punishment alone.

In order to show a just, proportional punishment, we want to be able to balance eternal punishment against all of life’s choices. However, it’s difficult to imagine the complete set of any person’s decisions at once in an entire lifetime; this is too abstract. Easier would be to instead examine a single choice and its consequences in isolation; that is, we want to know the amount of punishment that would be doled out for one specific action. By analogy, if a judge were to sentence a criminal to prison for X years, we would want to know which charges the criminal was guilty of, and how many years each charge merited. But, if Hell is the eternal consequence of a sinful life, what can we say about the consequences assigned to any single sin? How much suffering was earned by any single choice? Let’s make this concrete now.

Consider Sam the sinner, bound for hell. Given that Sam will be spending all eternity suffering (or deprived of the goodness of God or Heaven, as some reframe Hell), Sam will be experiencing either infinite suffering, or an infinite deprivation of good. In either case, there is a relative punishment with infinite duration.

Here, we want to take each individual act that led Sam to his eternal punishment, and attempt to say which proportion of each act is responsible for the magnitude of his outcome, infinite suffering. Now, here’s the key: it turns out that we need not know the degree to which each act is responsible for Sam’s sentence to Hell, since the prison sentence in this case is infinite. Each action in Sam’s life can be mapped onto a fractional portion of his eternity in hell – and regardless of the fraction’s size, each portion’s duration will still be infinite. This follows because infinity, multiplied by any fraction, is still infinity. In other words, you could take any one of Sam’s actions in life – no matter how small – and assign a proportional part of his eternal sentence as the punishment for that single action in isolation – and regardless of the unknown size of that action’s contribution to his fate, its corresponding fraction of infinity will be infinite. When the single action is taken in isolation, the absurdity of infinite punishment for any amount of wrongdoing is clearly an infinitely disproportionate form of retributive justice.
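The arithmetic behind this step is simple. If a single act carries some unknown but positive share $p$ of the blame, its proportional share of a sentence of length $T$ is $pT$, and as the sentence becomes unending, so does every positive share of it:

```latex
% Any positive fraction of an infinite sentence is itself infinite:
\[
\lim_{T \to \infty} p \, T = \infty
\qquad \text{for every fixed } p \in (0, 1].
\]
```

Note that this holds no matter how small $p$ is, which is why the argument does not depend on knowing how blame is apportioned among Sam’s acts.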

For instance: suppose one of Sam’s sins was cheating in the game of Monopoly. What portion of Sam’s eternity gets assigned to this single mistake? At the Gates of Hell, the Devil breaks down each sinful act and its corresponding punishment. The Devil says to Sam: “…and for the time you cheated in Monopoly, you will serve an infinite sentence.” No matter how many ways you divide infinity (no matter how many sins), an eternity of time can be assigned to each one of these offenses.

The lack of more immediate clarity on this problem of infinite disproportionality arises from a lifetime’s choices being the cause of one’s eternal destiny: it’s unclear how much each choice is responsible for the ultimate Heaven vs. Hell outcome. I tried showing that when measuring the proportionality of punishment for each sin, we need not know specifically how much each sin is responsible for, due to the properties of infinity. Once the most minor, single offense can be assigned an infinite duration of punishment, the problem of disproportionality becomes more clear.


Amazon’s Fake Review Problem

Update – the fake reviews featured below were removed shortly after this post blew up on Hacker News.

One reason we buy from Amazon: plenty of reviews. But what if many of Amazon’s top-reviewed items have fake, paid reviews?

I was looking for a sunrise alarm clock this morning and started searching through the many reviews, filtering by ones that mentioned “minutes,” since I wanted to learn about the product’s timer feature. This surfaced a bunch of similar-looking reviews:

Here, we see both the top and bottom review with the sentence,

The light can be pretty bright, you can adjust it where it’ll be dim and slowly brighten 30 minutes before the alarm time.

Did “Becky” and “Dione Milton” really both happen to write a review with the exact same 23-word sentence? Or is it more likely that they are agents sourcing reviews from a script, who sloppily pasted their reviews without rewriting them (as they were presumably instructed to do)? Note also the post dates: both December 12, 2017. “Becky” and “Dione Milton” both had private profiles hiding their 5-6 reviews, which looked very similar.
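Catching this particular pattern doesn’t even require advanced ML. Here is a minimal sketch that flags long sentences shared verbatim across different reviewers; the review text is abridged from the screenshots above, and the eight-word threshold is an arbitrary choice to skip generic short phrases:

```python
import re
from collections import defaultdict

def shared_sentences(reviews, min_words=8):
    """Map each sufficiently long sentence to the reviewers who used it
    verbatim; return only sentences appearing under more than one name."""
    seen = defaultdict(set)
    for reviewer, text in reviews:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            s = sentence.strip().lower()
            if len(s.split()) >= min_words:
                seen[s].add(reviewer)
    return {s: names for s, names in seen.items() if len(names) > 1}

COPIED = ("The light can be pretty bright, you can adjust it where it'll "
          "be dim and slowly brighten 30 minutes before the alarm time.")
reviews = [
    ("Becky", "Love it. " + COPIED),
    ("Dione Milton", COPIED + " Great buy."),
]
for sentence, names in shared_sentences(reviews).items():
    print(sorted(names), "->", sentence[:40])
```

A production system would want fuzzy matching (paid reviewers who bother to paraphrase would evade an exact-match check), but even this catches the sloppy copy-paste case above.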

Amazon – who has some of the world’s most advanced ML – really needs to step up its review fraud detection game. Imagine how great the Amazon shopping experience would be if we could trust its reviews.

Third party meta review sites like Fakespot will identify problems for us (in this case, the product got an “F” grade) – so why doesn’t Amazon?

Amazon: you can do better.

Update 2020-09-23: you might also want to watch “why Amazon has a fake review problem.”


Garbage sites on Flippa

I subscribed to announcements of websites for sale on Flippa. Over the past several months, I’ve noticed an interesting pattern.

Here is a notification that arrived in my inbox today:

questionable auctions
Flippa is filled with auctions for websites promising easy revenue. Above are two sites by the same seller.

The email summarized a couple listings for websites being auctioned. Both of the above listings make promises in the headlines for 100% “automated” and “work-free” sites. If the owner had such successful, automated sites, why would he be motivated to sell them? This should at least raise suspicion.

What about the traffic statistics and financials? Below is a screenshot of what the seller claimed:

Claimed earnings
Claimed earnings vs traffic

Note the spike in traffic for the past month: 0 to 15k visits in a month! What a massive upward trend, right? But wait: the traffic has almost no correlation with the ad revenue. The claimed monthly ad revenue increases steadily, and revenue should normally correlate strongly with traffic. One explanation is that the owner just installed Google Analytics last month, so traffic stats only started accumulating then. Another is that the seller somehow spiked the traffic stats for a month.
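One quick sanity check on numbers like these is revenue per visit. The figures below are hypothetical, shaped like the listing’s claims. Typical display-ad revenue runs on the order of a few dollars per thousand visits, so numbers in the hundreds or thousands of dollars per thousand visits imply that either the traffic or the revenue is not real:

```python
# Hypothetical figures shaped like the listing's claims: ad revenue climbs
# steadily while traffic is negligible until a sudden final-month spike.
months  = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
visits  = [120, 130, 110, 140, 125, 15000]
revenue = [210, 240, 260, 300, 330, 350]   # dollars per month

for m, v, r in zip(months, visits, revenue):
    print(f"{m}: ${1000 * r / v:,.0f} ad revenue per 1,000 visits")
```

With numbers like these, the early months imply an absurdly high revenue per visitor, and the spike month an entirely different business; the two claimed series can’t both be honest.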

Suspicious of the source of this traffic, I checked Alexa, a web traffic analytics company. According to Alexa, the vast majority of traffic comes from India. Why might that be?

Click for full-sized image
Inflated traffic from India

One easy way to inflate traffic stats is to temporarily increase them by hiring cheap labor and/or automated bots to visit the site. Since the same seller had listed each site, was it any surprise that both sites had the vast majority of their traffic coming from India? This proves nothing, but it is certainly alarming, since there’s no particular reason to expect most traffic for sites like these to come from India.

The seller claims in the listing title that the site is 100% “automated.” If this is the case, then how is the copy being written? Or, are articles just scraped (copied verbatim) from other sites? One way to see if the content of the website was copied is to search a string of text that should be unique to the site. In this case, a Google search reveals that the copy for the test string I chose is found on over one-thousand other sites. This suggests that the copy for this site is most likely scraped from other sites. Google is known to strongly penalize sites for this behavior.

flippa scams
over a thousand other sites contained this exact string of text

We can take a guess, with pretty high confidence, that the site for sale is not the original creator of the “automated” content.

Once again, the auction listing title states the site is “100% automated” and requires “0 work.” Now, read the disclaimer included at the bottom of the auction:

This is great website for anyone interested in Profit making cash cow  with Health concept I like to be completely transparent in my transactions and don’t like to mislead anyone. What you see is what you get. It does involve effort and money won’t come from no nothing. The key to getting this off the ground is auto content and unique

So, in other words, the “key” to success is not leaving the site on autopilot, and “unique” [sic] – unique content, that is. As demonstrated, this site is far from “unique,” having articles that appear on a thousand other sites.

This type of listing is extremely common on Flippa. I’ve noticed this pattern over the past several months. I’m not the first to observe that the site is rife with scams, though. Feel free to learn more about the scams on Flippa.

On a final note, not everything sold on Flippa is junk – just the majority! Buyer beware!


Prediction, Day Trading, and Confirmation Bias

Why do most day traders persist, despite a lack of success?

I’ve listened to several day traders speak at length about their progress, and heard a common thread: in explaining their slow progress, they speak of the difficulty of mastering their emotions. The traders don’t know each other, but they follow the same approach to learning the discipline: they trade real stocks with real money in real time as they hone their skills.

These traders correctly identify that regardless of the overall accuracy in their trading strategies, they will have ups and downs. However, when explaining long-term failures, they continue to cite a lack of mastery of emotions, without suggesting the possibility of the alternative explanation: a strategy that just doesn’t work. Although these traders are not following a precise algorithm, they could be, if only they were able to define their trading strategy with sufficient precision. The “emotions” factor could then be taken out of play, and the traders could see whether their algorithms were viable from a back testing approach: they would apply their algorithms to a large sample of past situations to see how their portfolios would have performed with fear and greed outside of the picture.

It’s not true that favorable back testing guarantees positive future performance, but it is highly probable that in day trading, back testing that yields negative results implies that the algorithm would not perform well in the future. Yet no time is spent on initial back testing to weed out poor strategies; instead, these strategies are first tested in real time over a span of costly years.
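To make the term concrete, here is a toy backtest, not a real framework: a made-up daily price series, a simplistic “buy after an up day, sell after a down day” rule, and a comparison against buy-and-hold. A trader’s actual strategy would replace the rule inside the loop:

```python
# Hypothetical daily closing prices (made up for illustration).
prices = [100, 101, 99, 98, 100, 103, 102, 104, 101, 105]

def backtest(prices):
    """Run the toy rule over the price series and return final equity."""
    cash, shares = 1000.0, 0.0
    for prev, today in zip(prices, prices[1:]):
        if today > prev and cash > 0:        # up day: go long
            shares, cash = cash / today, 0.0
        elif today < prev and shares > 0:    # down day: exit
            cash, shares = shares * today, 0.0
    return cash + shares * prices[-1]

strategy = backtest(prices)
buy_hold = 1000.0 * prices[-1] / prices[0]
print(f"strategy: ${strategy:.2f}  buy-and-hold: ${buy_hold:.2f}")
```

Run over enough historical data, a comparison like this, however crude, would reveal a non-viable rule in minutes instead of years, with fear and greed out of the picture. (On this made-up series, the rule underperforms simply holding.)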

If these traders really wanted to see whether discipline was at the heart of the problem, they could still do it – but they don’t, because it is both emotionally and mathematically challenging to embark on an attempt to disconfirm their hypotheses. However, if only they sought to disconfirm hypotheses from the start, before they became so invested in them, then alternate hypotheses showing more promise could have been tested over the years.

In day trading, periods of success are overly attributed to evidence that the strategy works, and periods of failure are attributed to failure in applying the strategy. The strategy itself is kept insulated from criticism. And because of the difficulty of separating the signal from the noise, the illusion easily persists.


Last Cookie Hypothesis

the last cookie
a familiar scene

How often have you walked past the tray of cookies at your office and noticed that there is only one cookie left? My guess is a strangely disproportionate number of times. You could substitute any other tray of baked goods (brownies, muffins, etc.) for the cookies in this observation.

So, what’s the deal? Why is there so often precisely one cookie left?

I’ve asked others. They usually claim an altruistic reason like “not being greedy.” I call shenanigans!

So, why then was there only one cookie left? On some level (conscious or not), the others considered the possibility that there is something wrong with this cookie, given that it is still there after all this time. Let’s face it: every time one of the other cookies was taken, perhaps 20 times over, this one was passed over. It is also guaranteed to have been sitting there longer than any of the others.

One more motivation: nobody wants to clean up the cookie tray. Taking that last cookie leads to some sense of responsibility for cleaning up the mess.

Now you can sleep at night, understanding why it seems there’s always one last cookie sitting on a plate at the office.