That will depend on how long your text is.
First of all, parse the HTML and extract only the text.
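For that step, here is a minimal sketch using BeautifulSoup (my choice of library and parser here is an assumption; any HTML-to-text extractor will do):

from bs4 import BeautifulSoup

def extract_text(html):
    soup = BeautifulSoup(html, 'html.parser')
    # Drop script/style elements, which contain no natural-language text.
    for tag in soup(['script', 'style']):
        tag.decompose()
    return soup.get_text(separator=' ', strip=True)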
If the text is long, you can use a cheap method that looks only at stopwords. Get a list of stopwords for each language and count how many of them appear in your text. The NLTK corpus (Python) provides good stopword lists, plus handy functions to tokenize sentences and words.
import nltk

ENGLISH_STOPWORDS = set(nltk.corpus.stopwords.words('english'))
NON_ENGLISH_STOPWORDS = set(nltk.corpus.stopwords.words()) - ENGLISH_STOPWORDS

# Map every language available in the corpus to its stopword set.
STOPWORDS_DICT = {lang: set(nltk.corpus.stopwords.words(lang))
                  for lang in nltk.corpus.stopwords.fileids()}

def get_language(text):
    # Tokenize, then pick the language whose stopwords overlap the text the most.
    words = set(nltk.wordpunct_tokenize(text.lower()))
    return max(((lang, len(words & stopwords)) for lang, stopwords in STOPWORDS_DICT.items()),
               key=lambda x: x[1])[0]

lang = get_language('This is my test text')  # returns 'english'
More explanation at http://www.algorithm.co.il/blogs/programming/python/cheap-language-detection-nltk/
If you go with Python + NLTK, don't forget to download the NLTK corpora after installing:
import nltk
nltk.download()
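The interactive downloader lets you pick packages; for the snippet above only the stopwords corpus is strictly needed (wordpunct_tokenize is regex-based and needs no extra data), so a non-interactive download also works:

nltk.download('stopwords')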