Security check up, is it worthwhile?

I’ve been going over a couple of websites recently, checking their security settings and conducting some penetration attempts to find potential vulnerabilities.

Why is it worthwhile?

You may be thinking that you’re not a bank, so what does it matter if someone hacks your website? There is no customer data or money to be stolen. You’re not a big site, so why would anyone bother?

If you stop and think a bit, there are several issues you want to avoid.

Reputational Risk

You don’t want someone adding content to your website; it makes you look bad if there are suddenly links to inappropriate content.
It makes you look unprofessional, and you don’t want to lose a customer because your website isn’t under control. At the least it makes you look dopey because you have not noticed.

Loss of traffic

If your website hosts inappropriate content your visitors may not reach you, either because your site drops down the search engine results or because visitors to your site are redirected somewhere else.
I’ve seen this happen only to visitors who followed a link from a Google search page, so the owner of the website was completely unaware. You don’t tend to search Google for your own domain name.
On the internet your competitors are just a click away. The customer doesn’t have to walk down the road to the next store, so it’s very easy to lose customers.

Loss of service

If your website breaks down customers may not be able to access simple things like your telephone number.

It’s a waste of your time sorting out the mess.

You’re probably quite busy running your business; you don’t want to be tidying up after a break-in. It never happens at a convenient time and it takes a while to sort out. Once it’s happened you can’t really trust the code and stored information on the website. How do you know what’s been tainted and what hasn’t?
Do you have a clean, up-to-date copy of the code, or a recent copy of the database? You will probably end up paying someone to sort it out for you, which may be costly and perhaps not swift.

Your customers may expect it of you.

Several of my customers have been required to conduct security reviews/penetration tests as part of their own customers’ tick-box supplier assessments.

Hackers often aren’t very targeted or concerted. It’s more a matter of ne’er-do-wells consistently wandering around the neighbourhood trying doors and windows to see if anyone has left theirs unlocked.
Mostly the attempts come from bots and automated scripts. Often they are just looking for some place to host their code or expand their network. It doesn’t matter much to them how big or important your site is, and the cost to them is so low that it’s worthwhile probing loads of quite minor sites.
It’s easy to forget about a site, especially if you don’t use it often, but it’s worthwhile servicing it occasionally.

Elasticsearch

I’ve been trying out Elasticsearch recently. I’ve used Solr before but not Elasticsearch.

The original search was producing spotty results: for some terms it was fine, for others it was useless. It was a black box so it wasn’t particularly configurable, and it took a long time and errored or locked up whilst it was reindexing. In addition, the site it was working on is a bit niche and available in several languages, and we didn’t have the capacity to tweak the search to match. It sometimes seemed slow.

Elasticsearch solves some of these issues out of the box. You can set up a separate index for each language, and language-specific stemmers allow better results to be returned for variants of words, for example:
wolf, wolves, wolverine

It also allows configuration of which fields are indexed and how search queries are processed:

  • Fuzzy search
  • Exact matching
  • Multi matching (matching across several fields)
  • Stemming
  • Synonyms
  • Shingles (which I have not tried yet)

It can also return suggestions based on the terms in the index, rather than dictionary terms that may not appear in your index at all.
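As a rough sketch, the per-language indexes and a fuzzy query across several fields could be expressed as request bodies like these, written here as plain Python dictionaries. The index naming scheme, the field names `title` and `description`, and the four languages chosen are assumptions for illustration. The analyzer names are Elasticsearch’s built-in language analyzers, and the mapping layout assumes a recent Elasticsearch version.

```python
# Hypothetical language codes mapped to Elasticsearch's built-in language analyzers.
LANGUAGE_ANALYZERS = {
    "en": "english",
    "fr": "french",
    "de": "german",
    "es": "spanish",
}


def index_name(language):
    """Name of the per-language index (the naming scheme is an assumption)."""
    return f"products_{language}"


def index_body(language):
    """Mapping that applies a language-specific analyzer (and so its stemmer) to the text fields."""
    analyzer = LANGUAGE_ANALYZERS[language]
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text", "analyzer": analyzer},
                "description": {"type": "text", "analyzer": analyzer},
            }
        }
    }


def search_body(text):
    """Fuzzy query matched across several fields (a multi_match query)."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title", "description"],
                "fuzziness": "AUTO",
            }
        }
    }
```

Each body would be sent to the matching index, so a French search only ever runs against the French index and its French stemmer.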

The results seem to be quite fast. We have about 2000 products in 4 different languages, so about 8000 records.

Next steps

I’m sure we will tweak the settings over time, giving the titles of products and their descriptions different weights. We are already identifying some synonyms that need adding, for example starwars = star wars and spider-man = spider man = spiderman.
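A minimal sketch of what those two tweaks might look like, again as Python dictionaries. The filter name `product_synonyms`, the analyzer name `product_text`, the field names and the boost of 3 are all made up for illustration, not our actual settings. Multi-word synonyms can need extra care in Elasticsearch, so treat this as a starting point.

```python
def synonym_settings():
    """Index settings defining a custom analyzer with a synonym filter."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name; each entry lists terms treated as equivalent.
                    "product_synonyms": {
                        "type": "synonym",
                        "synonyms": [
                            "starwars, star wars",
                            "spider-man, spider man, spiderman",
                        ],
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name, to be referenced from the field mappings.
                    "product_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "product_synonyms"],
                    }
                },
            }
        }
    }


def weighted_search_body(text, title_boost=3):
    """Multi-field query where a match in the title counts for more than one in the description."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # "title^3" means matches in the title are boosted by a factor of 3.
                "fields": [f"title^{title_boost}", "description"],
            }
        }
    }
```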

An issue we seem to have at the moment is terms that match many items. For example, if you currently search for ‘Marvel Legends’ you get 270 items, so if you search for ‘Marvel Legends Medusa’ the ‘Medusa’ is the most important part, but the code can’t readily tell that.

Whilst ‘Marvel Legends Medusa Exclusive’ might come first, you also get all the results featuring ‘Marvel Legends’. It might actually be better to search for just ‘Medusa’ because that is the distinctive part or key phrase.
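One common way to put a number on “distinctive” is inverse document frequency: a term that appears in few items scores higher than one that appears in most of them. The sketch below is only an illustration of that idea on a handful of made-up product titles. It is not key phrase extraction and not what our search currently does. Elasticsearch’s default scoring uses a related idea internally, so this is more a way of reasoning about the problem than a fix.

```python
import math


def inverse_document_frequency(term, titles):
    """Higher score for terms that occur in fewer titles (smoothed so it never divides by zero)."""
    term = term.lower()
    matches = sum(1 for title in titles if term in title.lower().split())
    return math.log((1 + len(titles)) / (1 + matches)) + 1


def most_distinctive_term(query, titles):
    """Pick the query word that is rarest across the product titles."""
    return max(query.split(), key=lambda term: inverse_document_frequency(term, titles))


# Hypothetical product titles, for illustration only.
SAMPLE_TITLES = [
    "Marvel Legends Medusa Exclusive",
    "Marvel Legends Black Bolt",
    "Marvel Legends Spider-Man",
    "Star Wars Black Series Boba Fett",
]
```

With these sample titles, `most_distinctive_term("Marvel Legends Medusa", SAMPLE_TITLES)` picks out `Medusa`, because ‘Marvel’ and ‘Legends’ each appear in three titles and ‘Medusa’ in only one.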

It is probably worth a trial with a common term to see which customers prefer: results featuring lots of other Marvel Legends figures (they searched for Medusa but then go on to buy other figures in the series), or results featuring just Medusa.

We could probably show expertise by also showing figures related to Medusa, like Black Bolt.

I need to read more about key phrase extraction for identifying the ‘Medusa’ parts of queries. There is a keyphrase extraction algorithm called KEA; perhaps that is the way to go. Lots of fiddling ahead :).

The addictive nature of coding

Coding is addictive, not in a mug-a-granny-to-get-your-next-hit kind of way, but in a scratch-that-itch, keep-scratching, compulsive way. You finish sorting out feature A, then drift on to adding feature B, and before you know it hours have gone by and you’re adding feature D. Perhaps it’s because it’s tinkering: if I just change this then that will work better, but if I change that again then in that circumstance this can be better …

So the temptation becomes to solve this problem with more code, or scratch that itch. If you ask a developer, the answer to a problem is more code, in the same way that if you ask a lawyer it’s more law, or a spider it’s more webs. It’s in their nature.

Computers are for computing stuff; they aren’t an end in themselves. More programs running on more computers won’t necessarily make the world a better place, just a more complicated one with more and more points of interaction, in a world more and more divorced from reality.

How does this affect the real world? Or, if the code was a black box you were using from the outside, would you care about this? Probably in many circumstances no. The end user doesn’t care about speed or how it works at all until it doesn’t work, or it crawls to what seems to you a halt.

If you’re in a hole and you keep digging, you’re creating a deeper rabbit hole until it starts filling up with water and you drown.

‘Could this be resolved without code?’ is a question to ask, or similarly, with less code and fewer libraries and dependencies.
In short, first stop and think: is the solution more code?

The universal website outline

On your basic website, at its core, you want to say:

What do you do

What service, product or event are you offering or supporting?

How to Contact You

Best ways to contact you: postal, email, telephone, Twitter, Facebook, smoke signals etc.

Include Legal Guff, Compliance

Then there is legal compliance, usually in the UK: company / charity name, number, registered office. Usually this is the same as your contact details. You can link to a page on Companies House.

Selling things

If you’re selling stuff online then the essentials would include:

  • listing of products
  • ability to add items to a basket
  • something to add delivery details and pay.

Those are the essentials really. From then on you’re really providing more about you, and mostly that is proof of what you say you do: testimonials from customers, case studies or examples to show that reality matches up with what you said about yourself.

Mostly for visitors the website is offering the answers to Who, What, When, Where and How.

In Summary

Your website doesn’t have to be complicated and perhaps simple is better; at least people won’t have to crawl through pages of marketing blah, blah trying to interpret what it is you actually do. If it’s so complicated that you can’t explain it succinctly then you probably don’t understand what you do either. If you want to add more proof, answers to questions that you keep getting asked, or pictures of your cat, you can always do it later.


If you’re regulated by someone like the FSA or The Charity Commission you will want to add your registration there, just for public reassurance if nothing else. The legally minded like to add terms and conditions, then disclaimers about copyright and third party links.

I like it when sites list details about cookies set because it at least shows they are aware of them and gives you an idea of what that request to xyz-network.thailand is doing when the page loads.

You probably want some sort of stats so you can get an idea of who is visiting your site. Lots of those exist, from basic to advanced. If you’re on shared hosting, which you probably should be for a simple site, then you probably have some like Webalizer, AWStats, Piwik or Analog that you can tick a box to install. Others like Google Analytics give you some code to paste into your webpages.

Scumbags, Idiots and Charlatans

In the last few weeks I’ve come across a lot of client abuse on the interweb.
Customers paying through the nose for web development and just being ignored.
Customers being abandoned.

I’ve been picking up the pieces for a couple of people and it makes me feel kind of sad; there just isn’t any need for it. OK, things change and that is fine. The web developer is not married to the customer, and if you don’t have the skills in house to support them then that is fine, but say that.

Not all customers are super technical and some need more hand-holding than others; many are actually frustrating.

But there isn’t really any excuse for treating people that badly. Building web stuff is a people business: you’re coding stuff that will be used by humans in order to fulfil aims that other humans have. People are sometimes difficult and irritating and occasionally just downright maddening, but it’s the developer’s job to help them, not to impose stuff on them, extract money for nothing or exploit their lack of knowledge. If you can’t get on, say that and leave them. Don’t just ignore them; you could even help them find someone new to sort things out.

Spam text and Markov chains: would that make Turing happy?

Using Markov chains, some Python and a source text I can quite easily produce quite realistic texts. Turing might even have been pleased, in a perverse way, that his machines are indistinguishable from humans. Many blog commenters seem to be using their second language, which is fair enough, but it does make it difficult to tell if they are genuine humans. Often the only way I can tell if comments are spam is to look at the links they include. Super cheap *** something is usually a clue.
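For the curious, a word-level Markov chain generator can be sketched in a few lines of Python. This is a minimal illustration of the general technique rather than the exact script I used. The function names and the `order` and `seed` parameters are just choices made for this sketch.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen following it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Random walk through the chain, producing at most `length` words."""
    rng = random.Random(seed)
    key = rng.choice(list(chain.keys()))
    output = list(key)
    for _ in range(length - len(key)):
        followers = chain.get(key)
        if not followers:
            # Dead end: this run of words only appeared at the very end of the source.
            break
        output.append(rng.choice(followers))
        key = tuple(output[-len(key):])
    return " ".join(output)
```

You would feed `build_chain` the full source text and call `generate` on the result. A higher `order` gives more plausible sentences but copies longer runs straight from the source.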

Just for fun, here is a brief example of text auto-generated from Henry Ford’s book ‘My Life and Work’. Markov Ford?

They think that is the business and who do not stand the assumption of mind.

The average person may be filled by foot of fear will respond.

Everything has won his maintenance and also happened to produce also we should prove that bread. If a very carefully tried to greedy for a business could ever really constructive thinking over to share; at a child to the railways and then is the filing of life, manufacture, a tractor came down you see, it must be easy. Success is the country that same with personalities.

To my plant at the limits his territory.

This is pretty crude and still quite easy to tell apart from normal text. With a little more effort I’m sure I could make it much more “human”. I hope most spam is generated by this sort of method. I guess the alternative is poor people in foreign countries generating spam for money, which is probably worse and doesn’t feature much technology at all, so is pretty uninteresting.

Now I wonder if I can produce my own Management Guru book :)

Are coupons just a distraction?

Coupons are a good way of taking things offline: a coupon code can be on a card or in the package with a new purchase. You can use them as part of a referral or affiliate marketing scheme. I wonder, though, are they a turn-off in the checkout? How often do you go to a checkout, see a box for a coupon code and then go off looking for a coupon?

If you’re thinking of coupons as an incentive to go and buy something then the coupon box at checkout is strange. The customer has already decided to buy something: they put it in their basket, they saw the price, and they are happy to pay it. Then you offer them a coupon. The coupon they are entering didn’t make them any more likely to buy anything.

Are you making them more likely to wander off at the point in the process when they were just about to enter their credit card details and complete a purchase? Perhaps the box is better off earlier in the process, or perhaps it’s just me who acts in that way when faced with a coupon code box. I guess I’ll try running a trial and find out.


I wrote a referral scheme for a client recently. It makes sense: existing customers are your target customers and they probably know other people like them who could be your new customers. A happy customer recommending your service to a friend has a lot more weight than you plugging your own service.
You do need a few customers to start with though.

What referrals are not

It’s not an affiliate scheme, although they are related. A referral scheme is about a customer telling another customer about you. At worst an affiliate scheme is some random and potentially non-ideal customer putting a link on their site to your site.
They aren’t saying to someone they know, ‘I use these people, they are really good, they could be for you.’ They are just broadcasting an advert with the aim of a financial reward. They probably are not using their knowledge of your service in order to aim at people it is suitable for.
So you may end up with a whole bunch of prospects who are less than ideal or a waste of time.

Do you need to financially incentivise people?

Often this is all thought about as: if I give person X £1 for each person they refer, then they will refer me lots of people who I can make £100 on. But I’m thinking you don’t buy goodwill. People want to tell other people how well they are doing, and if your product works well for them they will tell their friends, unless you do something really embarrassing for them (not sure how well referrals would work for an STD clinic).
You can encourage your customers to give referrals in other ways as well. Send them flowers, thank them in person, buy them bottles of wine. Them referring you is personal, so hopefully the reward is personal.

Oh and hopefully deliver a fantastic product or service.

Do Not Track (DNT)

You know those adverts that track you from one site to another, so if you’re on the Northern Territory News (for all your crocodile related news) it shows adverts from other sites you have been on earlier? I sometimes feel stalked by New Relic offers, for example.

The aim of DNT is to avoid that slightly sinister feeling behavior and avoid advertising and other networks tracking you across sites. This does seem on the verge of being evil and just as you can opt out of marketing emails surely you should be able to opt out of this kind of tracking.

It would be nice if customers were more aware of DNT (see more details here); it seems in outline a good idea. But it depends on web developers and, more complicatedly, advertising networks honouring it. At the moment it’s a bit of a request, ‘please opt me out of third party tracking’, which may or may not fall on deaf ears.
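On the developer side, honouring the request can be quite small. A browser with the preference switched on sends a `DNT: 1` request header. The sketch below shows one way a site might check it before adding third party tracking scripts. The function names and the idea of passing in a list of script URLs are assumptions for illustration, not any particular framework’s API.

```python
def requests_no_tracking(headers):
    """True when the request carries the header 'DNT: 1' (header names are case-insensitive)."""
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") == "1"


def tracking_scripts_for(headers, scripts):
    """Return the third party scripts to include, or none if the visitor opted out."""
    if requests_no_tracking(headers):
        return []
    return list(scripts)
```

A header value of `0`, or no header at all, is treated here as no opt-out.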

You can disable third party cookies, depending on your browser, but that may lead to issues on some sites. For Firefox, for example, the instructions are here.

So at the moment I’m redirecting a bunch of sites to a local site on my laptop via the hosts file. That cuts out a whole bunch of rubbish but it’s not really for the amateur. It also speeds up some sites for me which are quite slow with big old Flash adverts.
It’s a bit of a nuclear option though, as once you have set it up you can’t see anything from that site even if you wanted to.
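For anyone wanting to try it, each hosts file entry is just an IP address followed by a hostname. A small sketch for generating the lines is below. The domains in the usage note are placeholders rather than my actual list, and pointing them at 127.0.0.1 (the local machine) is an assumption about the setup.

```python
# Loopback address: requests for the listed domains go to the local machine instead.
BLOCK_ADDRESS = "127.0.0.1"


def hosts_entries(domains, address=BLOCK_ADDRESS):
    """Render de-duplicated, sorted hosts file lines for the given domains."""
    return "\n".join(f"{address} {domain}" for domain in sorted(set(domains)))
```

Calling `hosts_entries(["ads.example.com", "tracker.example.net"])` gives two lines that can be pasted into the hosts file. Removing the lines again is the only way to see those sites.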

So the best solution out there for a normal person is still probably an ad-blocking add-on for your browser, although again it’s not really solving the issue of tracking, more blocking the result.

Perhaps DNT will get wider adoption as it gets more standardised, but I guess because tracking makes people money they have little incentive. Then again, that was the same a few years ago for popups and they are now radically reduced.

You can see your browser’s DNT preference here, which also shows instructions on how to set it in some browsers.


I’ve been reading Antifragile by Nassim Taleb and came across a term that was new to me, iatrogenics. The definition from Wikipedia:

Iatrogenesis, or an iatrogenic artifact (pron.: /aɪˌætroʊˈdʒɛnɪk/; “originating from a physician”) is an inadvertent adverse effect or complication resulting from medical treatment or advice, including that of psychologists, therapists, pharmacists, nurses, physicians and dentists. Iatrogenesis is not restricted to conventional medicine; it can also result from complementary and alternative medicine treatments.

It seems a neat idea: the costs of action compared to inaction, and how many people would be better off left alone.

Taleb uses it in the context of a personal doctor who feels they must doctor, or copy editors who feel they need to change words in order to justify themselves. What damage does busywork do? Perhaps it would be better in many cases to just leave things alone. Rather like in engineering: if it works, leave it well alone, don’t touch it.

Perhaps I should have iatrogenic days applied to software. I just spent a long time sorting out errors in a concrete5 system, only to find that the system relied on those errors. Errors cancelled out errors to make the system work (argghh).

Whilst this isn’t an argument for always doing nothing, perhaps there is an argument for considering the effects any action could have on the whole system, rather than being focused on the end result you have in mind at that moment and inadvertently causing more harm than good.