Yes, just recently I moved to London, and even though it was a tough move, it wasn’t a hard choice.
Come to England they said. It’s summer they said.
A little confession: I am not good at playing games. I don’t even try to pay attention to strategy. But I love games, and I know I am good at one thing: the logic needed to play games automatically. So I always try to put everything into a program so I can do other things.
Just recently, I made a simple Python program to auto-play the Scrabble-like game at Wordsquared.com. A smart bot, until it went too far: it reached level 100 in less than 24 hours and, of course, that didn’t go unnoticed.
(My last Wordsquared “island”)
(Another “island” of mine that looks like North America)
(One of the first areas that I created)
(Oh well, I’m a Veteran!)
Just like my old cheating robot, I got a “love letter” from the CEO of MassivelyFun (the game company). He told me that they don’t exclude bots from playing the game, but they do exclude bots from the leaderboards. They also reserve the right to rate-limit the play if bots disrupt other (human) players.
Wordsquared is a great game and I like it. I actually enjoy playing it (manually). And if you like Scrabble-like games, I’d strongly recommend you try it.
No, I will not publish the program.
(A graph of the pressure wave produced by a tuning fork is a sine wave/The Open University)
In the process of making a digital music track, producers might use sampling as a natural part of the creative process. Sampling means incorporating another sound recording into a new track. The creative act of sampling is nothing new, but it can create a legal headache, since it may constitute copyright infringement:
If the original artists aren’t credited, or they object to the act of sampling, their moral rights may be infringed.
Contrary to popular myth, samples aren’t billed on a per-second basis like some phone calls — nor are they free when under three seconds long. The overall impact of the sample, together with all relevant commercial factors, means that each sample is evaluated on a case-by-case basis.
A major artist may be able to charge a tremendous price for the right to sample their work. They’ll probably expect an advance payment before the derivative work is sold. An opportunistic publisher may also demand a cut. The Verve learned this lesson the hard way: they did not cash in on the success of their record sales as much as they would have liked, since most of the royalties went to the Rolling Stones’ former manager, Allen Klein.
Sample clearance can be easy, but it might be a time-consuming process, especially if the rights holders are based overseas or the sampled track has itself sampled another work. All original samples need to be cleared.
Apparently there’s a workaround: an artist can recreate something that sounds just like the original. Yes, it’s perfectly legal, and there are companies, such as Replay Heaven and Scorccio, that can be employed for sample recreation.
With sample recreation, the keyword is copyright. In music there are two copyrights: one in the mechanical recording (the audio) and one in the composition (or publishing, e.g. the written lyrics or music of the song). With sample recreation/replays, the mechanical copyright is not infringed, because the sample has been “covered” by someone else, recorded brand new and usually long after the original artist first recorded it.
But the publishing copyright still needs to be cleared, because the composition of the written lyrics and/or music usually remains unchanged. Ultimately, under UK, EU and US music law, a sample replay falls into exactly the same category as a “cover version”. Even though it may sound like a virtual “clone” of the original work, as long as it is a brand new recording, a “cover version” is exactly what a sample replay is.
It’s a simple case of approaching the publishing company for the song and obtaining their approval. You can also apply online, via clearance company websites. 99.9% of the time, publishers are very happy to see their repertoire covered; that is exactly what their service is about and how they make a profit for themselves and their respective writers.
The percentage of publishing they decide on depends upon how much of their “original work” (as they call it) has been used in the “new work” (i.e. your new track). If your 4-minute track includes 3 minutes of their “work”, then the split is likely to be very much in their favour.
Or we can just go underground and expect the record will not be a hit.
Apparently Einstein kicked Newton’s butt, not once but twice. The Special Theory of Relativity (1905) showed that Newton’s Three Laws of Motion were only approximately correct, breaking down when velocities approached that of light. And the General Theory of Relativity (presented in late 1915 and published in 1916) showed that Newton’s Law of Gravitation was also only approximately correct, breaking down when gravitation becomes very strong.
In Newton’s second law of motion, an object’s mass is measured by seeing how much it resists a change in motion (its inertia). In Newton’s law of gravity, an object’s mass is determined by measuring how much gravity force it feels. The fact that the two masses are the same is why Galileo found that all things will fall with the same acceleration.
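This coincidence can be written out explicitly. Using the inertial mass in the second law and the gravitational mass in the law of gravitation (a standard textbook sketch, not from the sources quoted below):

```latex
F = m_i a, \qquad F = \frac{G M m_g}{r^2}
\quad\Rightarrow\quad
a = \frac{m_g}{m_i}\,\frac{G M}{r^2}
```

Since experiments find $m_i = m_g$, the ratio drops out and every object falls with the same acceleration $a = GM/r^2$, which is exactly what Galileo observed.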
Einstein, who won the Nobel Prize in Physics in 1921, didn’t believe gravity was a force at all; he said it was a distortion in the shape of space-time, otherwise known as “the fourth dimension”.
From the description of the book Relativity: The Special and the General Theory:
General relativity or the general theory of relativity is the geometric theory of gravitation published by Albert Einstein in 1915. It is the current description of gravitation in modern physics. General relativity generalises special relativity and Newton’s law of universal gravitation, providing a unified description of gravity as a geometric property of space and time, or spacetime. In particular, the curvature of spacetime is directly related to the four-momentum (mass-energy and linear momentum) of whatever matter and radiation are present.
The relation is specified by the Einstein field equations, a system of partial differential equations. Einstein’s theory has important astrophysical implications. For example, it implies the existence of black holes (regions of space in which space and time are distorted in such a way that nothing, not even light, can escape) as an end-state for massive stars. There is evidence that such stellar black holes as well as more massive varieties of black hole are responsible for the intense radiation emitted by certain types of astronomical objects such as active galactic nuclei or microquasars.
Lawrence M. Krauss in his book, A Universe From Nothing: Why There Is Something Rather Than Nothing:
Einstein’s theory is not just a new theory of gravity, it was also the first theory that could explain not merely how objects move through the universe, but also how the universe itself might evolve.
On Astronomy 162, taught in The University of Tennessee, Knoxville:
Einstein’s Special Theory of Relativity is valid for systems that are not accelerating. Since from Newton’s second law an acceleration implies a force, special relativity is valid only when no forces act. Thus, it cannot be used generally when there is a gravitational field present.
The General Theory of Relativity was Einstein’s stupendous effort to remove the restriction on Special Relativity that no accelerations (and therefore no forces) be present, so that he could apply his ideas to the gravitational force. It is a measure of the difficulty of the problem that it took even the great Einstein approximately 10 years to fully understand how to do this. Thus, the General Theory of Relativity is a new theory of gravitation proposed in place of Newtonian gravitation.
From How Stuff Works:
Basic physics states that if there are no external forces at work, an object will always travel in the straightest possible line. Accordingly, without an external force, two objects travelling along parallel paths will always remain parallel. They will never meet.
But the fact is, they do meet. Particles that start off on parallel paths sometimes end up colliding. Newton’s theory says this can occur because of gravity, a force attracting those objects to one another or to a single, third object. Einstein also says this occurs due to gravity — but in his theory, gravity is not a force. It’s a curve in space-time.
According to Einstein, those objects are still travelling along the straightest possible line, but due to a distortion in space-time, the straightest possible line is now along a spherical path. So two objects that were moving along a flat plane are now moving along a spherical plane. And two straight paths along that sphere end in a single point.
Still more-recent theories of gravity express the phenomenon in terms of particles and waves. One view states that particles called gravitons cause objects to be attracted to one another. Gravitons have never actually been observed, though. And neither have gravitational waves, sometimes called gravitational radiation, which supposedly are generated when an object is accelerated by an external force.
Gravitons or no gravitons, we know that what goes up must come down. Perhaps someday, we’ll know exactly why. But until then, we can be satisfied just knowing that planet Earth won’t go hurtling into the sun anytime soon. Gravity is keeping it safely in orbit.
Einstein proposed in his General Relativity theory that what is called gravity is really the result of curved spacetime.
Einstein described gravity as a warping of spacetime around a massive object. The stronger the gravity, the more spacetime is warped.
Light travels along the curved space taking the shortest path between two points. Therefore, light is deflected toward a massive object! The stronger the local gravity is, the greater the light path is bent.
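This bending can be quantified. General relativity predicts that light grazing a mass $M$ at closest approach distance $b$ is deflected by

```latex
\delta\theta = \frac{4 G M}{c^2 b}
```

which is twice the value a naive Newtonian calculation gives; Eddington’s 1919 eclipse measurement of starlight bending around the Sun matched Einstein’s factor, not Newton’s.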
Einstein’s theory is not perfect (no scientific theory is absolutely perfect), but it does give a better understanding of the universe.
Two key predictions of Albert Einstein’s general theory of relativity have been confirmed by NASA’s Gravity Probe B mission.
(An artist’s concept of Gravity Probe B orbiting Earth, which is warping spacetime/NASA)
The first is the geodetic effect, which is the warping of space and time—or spacetime—around a gravitational body, such as a planet.
The second effect of gravity tested by Gravity Probe B is frame dragging, which is the amount that a spinning object pulls the fabric of spacetime along with it.
OOT: I didn’t know that we might have so many dimensions (also known as hyperspace); some theories posit (at least) seven dimensions, including one for time! Unfortunately we cannot fully envision them. I’ll get back to this in future blog posts.
Interesting editorial from the FierceHealthIT on NLP in Healthcare.
In a recent post on his Disease Management Care Blog, Jaan Sidorov speculated that natural language processing (NLP) might be used to pick up missing diagnoses from free text and perhaps even predict problems before physicians spot them.
Sidorov also cited a study that found the use of an NLP program to scan free text in encounter records was nearly as accurate as lab tests in showing whether patients had the flu. While the study focused on using NLP in biosurveillance to spot disease outbreaks early, he was intrigued by the possibility of employing such a system to detect diseases that physicians had not yet diagnosed.
The hope is that NLP will eventually be able to parse medical terms in free text to speed up data entry. The idea of using NLP for predictive modelling and alerts, meanwhile, will continue to gain traction as researchers discover new ways to apply the growing power and speed of computers to medicine. Of course, computerised insights will never replace the intuition and knowledge of a skilled, experienced physician. But it would be nice if he or she had that extra edge.
NLP has been used for some time to extract medical terms and medical problems from electronic clinical documents. Hopes have been raised, since IBM’s Watson won Jeopardy!, that NLP will be able to aid doctors in clinical documentation. What IBM does think Watson is good for is data analysis and decision support. The technology can scan and analyse data from far more sources than a human ever could in a short period of time, potentially aiding doctors in diagnosing complex but urgent conditions.
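To give a feel for the term-extraction idea mentioned above, here is a toy keyword-based extractor. This is nothing like a real clinical NLP system, and the symptom lexicon and the example note are invented for illustration; it also shows a classic weakness of naive matching:

```python
import re

# Hypothetical mini-lexicon; real systems use large curated vocabularies
# such as UMLS, not a hand-written set.
SYMPTOM_LEXICON = {"fever", "cough", "headache", "fatigue", "sore throat"}

def extract_symptoms(note):
    """Return the lexicon terms mentioned anywhere in a free-text note."""
    text = note.lower()
    return sorted(term for term in SYMPTOM_LEXICON
                  if re.search(r"\b" + re.escape(term) + r"\b", text))

note = "Pt c/o fever and dry cough x3 days, denies headache."
print(extract_symptoms(note))  # ['cough', 'fever', 'headache']
```

Note that “headache” is extracted even though the note says the patient *denies* it: naive keyword matching ignores negation, which is precisely why real clinical NLP is hard and why studies like the one Sidorov cites are noteworthy.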
This talk provides an in-depth treatment of satellite telephony networks from a security perspective. The overall system seems secure, but in reality, it cannot be expected to be fully reliable.
We will briefly cover the satellite mobile system architecture, then discuss GMR (GEO-Mobile Radio) system elements, e.g. GSS (Gateway Station Subsystem), MES (Mobile Earth Station), AOC (Advanced Operation Center), and TCS (Traffic Control Subsystem) for GMR-1 systems and NCC (Network Control Center), GW (Gateway), SCF (Satellite Control Facility) and CMIS (Customer Management Information System) for GMR-2 systems.
From there, we will discuss the security issues of the GMR system, which shares similar vulnerabilities with GSM (GMR is derived from the terrestrial digital cellular standard GSM and supports access to GSM core networks), along with some interesting demos.
Since the mid-1950s, satellite systems have made enormous advances in capability and performance, and satellite communication services have become integral to our society. Unfortunately, security has not kept pace, and current systems are vulnerable to a variety of attacks.
This is the latest update in the Hacking a Bird in the Sky series. It will discuss important security issues in ATM (Automated Teller Machine) and private banking networks that use satellite links (based on real-life cases). We will also present possible evasion techniques for channel estimation, and a single-stage multiuser detector for asynchronous satellite communications for detecting rogue users.
Kerokan (rubbing or coining) is scraping the back, along the ribs, with the edge of a coin or spoon to bring up long red welts. Kerokan looks primitive and cruel to Westerners, and nobody knows exactly when kerokan started being used as a therapy, though many believe it dates back to the era of the old kingdoms. Most, if not all, Indonesians know about kerokan, including those who have never experienced it, and they swear it’s an effective cure for masuk angin, roughly “catching a cold” in English.
(Kerokan by Setiyo)
Pain diversion. Perhaps that is the simple way to explain kerokan.
The feeling of pain is triggered because damaged cells release prostaglandins, and this substance stimulates the pain nerve receptors. Inharmonious contraction can occur in the muscles of the peripheral vasculature. During kerokan, small blood vessels can break, which makes the skin look reddish. The stimulation from kerokan can ‘divert’ attention, bringing the discoordinated muscles back in tune. But according to some doctors, the pain caused by muscular discoordination can be relieved simply by taking antiprostaglandins, paracetamol, or aspirin.
Update: Apparently kerokan is not only popular in Indonesia. It’s known as “cạo gió” in Vietnam, “gah kyol” in Cambodia, or “gua sha” in China.
This post is about the Indonesian government’s effort to filter the Internet, but the filtering issues apply everywhere. My personal apologies for typos, grammatical errors, and misspellings; staying awake for more than 24 hours sucks.
A few days ago the Indonesian Communication and Information Minister, Tifatul Sembiring, threatened to shut down Internet access on BlackBerry devices in Indonesia unless access to porn sites was blocked on them. His goal is to block websites the government deems pornographic, as he is mandated to do by the 2008 Anti-Pornography Law.
In 2008, triggered by Fitna, a short political film by Geert Wilders presenting his views on Islam, the previous Communication and Information Minister, M. Nuh, also tried to filter the Internet in order to limit the accessibility of that video, as requested by the President. And it failed.
Here, I don’t want to discuss the reasons, since I’m not a religious person and I don’t really care about porn on the Internet. What’s important to me is how they (the government) do it.
Based on my observation, the way the Indonesian government filters Internet content is by involving ISPs and asking (perhaps the correct term is forcing) them to block the websites that fail to meet government rules, at the ISPs’ and customers’ cost. For now, porn seems to be the priority.
According to an article in the Jakarta Globe, the government has a very big list covering 90% of an estimated four million pornographic websites (I wonder where the list comes from; do they browse porn regularly?). In that article, Tifatul also said that the government has its own analysis algorithm (and mechanism?) for content filtering, but that it is up to the ISPs to implement the filtering method of their choice. Knowing how most Indonesian ISPs are configured, this is a recipe for disaster.
Any network engineer knows that content filtering is expensive, and quality content filtering is often a challenge to manage and maintain. The cost of deploying a filtering mechanism depends on the complexity of the hardware required to implement it. Specialised Internet content filtering equipment is more expensive than general-purpose equipment, and I don’t think ISPs are willing to spend extra on specialised gear.
So now, many Indonesian ISPs have finally implemented filtering of Internet traffic. Most of them go with the easiest and cheapest solution: DNS filtering. A few ISPs implement filtering on their HTTP proxy servers; this approach actually gives the greatest flexibility, allowing blocking both by full website URL and by webpage content. Those are the common approaches, but there’s another one that nearly turned the 2008 government filtering effort into a national disaster: IP filtering.
All those mechanisms suffer from the possibility of errors that may be of two kinds: false positives (where websites that were not intended to be blocked are inaccessible) and false negatives (where websites are accessible despite the intention that they be blocked). The trade-off between false positives and false negatives is a pervasive issue in computer security engineering.
IP address filtering is comparatively crude: it must block an entire IP address or address range, which may host multiple websites and other services. This clearly burns down the barn in order to kill the mice. DNS tampering (filtering) is slightly better, since it allows an individual website’s domain to be blocked even when the site is hosted on a machine shared with thousands of other sites.
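The difference in granularity is easy to see in a toy model. All hosts, IP addresses and domain names below are made up for illustration:

```python
# A shared-hosting server: one IP, several unrelated sites on it.
hosting = {
    "198.51.100.7": ["bad.example", "innocent-blog.example", "shop.example"],
    "203.0.113.9": ["news.example"],
}
ip_blocklist = {"198.51.100.7"}     # IP filtering: block the whole machine
dns_blocklist = {"bad.example"}     # DNS filtering: block one domain name

def blocked_by_ip(domain):
    """IP filtering blocks every site that shares a listed address."""
    for ip, domains in hosting.items():
        if domain in domains and ip in ip_blocklist:
            return True
    return False

def blocked_by_dns(domain):
    """DNS filtering blocks only the listed domain names."""
    return domain in dns_blocklist

# IP filtering takes down the innocent neighbours too:
print(blocked_by_ip("innocent-blog.example"))   # True (collateral damage)
print(blocked_by_dns("innocent-blog.example"))  # False
```

One blocklist entry at the IP level silently censors every co-hosted site, which is exactly the barn-burning problem described above.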
Although the mechanisms discussed here will block access to prohibited websites, the filtering may be circumvented. However, the effort and skills required vary.
DNS filtering is relatively easy to bypass: the user simply selects an alternative domain name resolver. This type of circumvention can be made more difficult by blocking access to external DNS servers (as usually happens in corporate or GPRS networks). IP filtering and HTTP proxy filtering can all be fooled by redirecting through another proxy server. If an ISP uses a transparent HTTP proxy, users can use a SOCKS proxy or Tor as an alternative.
Even where users are not attempting to circumvent the system, they may still be able to access the prohibited resources, for example via denial-of-service or social engineering attacks. Under a denial-of-service attack, an overloaded filtering system may simply fail open and drop its filtering rules. And remember, implementing content filtering will actually make things slower.
Personally, I’m not afraid of Internet content filtering. I usually find a way around the situation and get back what is rightfully mine: the freedom to access the Internet. But I’m a little bit worried, because the content filtering infrastructure is now vulnerable to the insertion of false information, not only by hackers but also by the government.
In the end, average Internet users might find content filtering annoying and ineffective at achieving the real goal (if that is just blocking porn). Now I’ll let you judge the current situation between RIM and the Indonesian government based on this Q&A with Gatot Dewabroto, the Kominfo spokesperson (sent to the Kampung Gajah mailing list by @askvicong).
@mayaluna: “Apakah dengan pemblokiran pornografi di BB, pemerintah dapat menjamin tingkat akses pornografi di Indonesia turun?” [With pornographic blocking in BB (BlackBerry devices), can government assure the access to pornographic content will be low?]
Gatot: “Tidak bisa dibilang begitu, pornografi adalah celah hukum yang kita gunakan biar kita bisa masuk ke RIM ini … ” [You can’t say that; pornography is the legal loophole we use so that we can get to RIM]
Oh, well… Whether he is misinformed (by his subordinates, as many people guess) or clueless about how Internet filtering works, I’d recommend Tifatul read this book: Access Denied: The Practice and Policy of Global Internet Filtering (Information Revolution and Global Politics).
Earlier today, when I wrote the antiskrembel code, I realised that I was doing a lot of sorting of a big list of words. I wondered whether I could optimise the sorting part, so I searched for and tested some sorting algorithms applicable in Python, since that is the language I use for my code.
By definition, a sorting algorithm is an algorithm that puts elements of a list in a certain order. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement.
These are the sorting algorithms I tested: adaptive merge sort (which is actually built into Python since version 2.3), bubble sort, heapsort, insertion sort, merge sort, quicksort and selection sort.
Since I lack a better explanation, I borrowed some text from Wikipedia for the following sorting algorithms. I ordered them from the worst to the best based on my naive test against a completely randomly shuffled array.
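Before the individual algorithms, here is a minimal sketch of what such a naive test harness looks like: shuffle a list, time each sort, and check the result. The function and variable names are mine, not from the original test-sort.py:

```python
import random
import time

def time_sort(sort_fn, n=1000):
    """Time sort_fn on a randomly shuffled list of n integers."""
    data = list(range(n))
    random.shuffle(data)
    start = time.time()
    result = sort_fn(data)
    elapsed = time.time() - start
    # Sanity check: the sort must actually produce the right order.
    assert result == list(range(n)), "sort returned wrong order"
    return elapsed

print(time_sort(sorted) >= 0.0)  # True
```

Each algorithm below can be dropped in as `sort_fn`; comparing the returned timings for the same `n` gives rankings like the ones at the end of this post.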
Bubble sort is a straightforward and simplistic method of sorting data that is used in computer science education. Bubble sort is based on the principle that an air bubble in water will rise displacing all the heavier water molecules in its way.
The algorithm starts at the beginning of the data set. It compares the first two elements, and if the first is greater than the second, then it swaps them. It continues doing this for each pair of adjacent elements to the end of the data set. It then starts again with the first two elements, repeating until no swaps have occurred on the last pass.
def bubble_sort(list):
    for i in range(0, len(list) - 1):
        swap_test = False
        for j in range(0, len(list) - i - 1):
            if list[j] > list[j + 1]:
                list[j], list[j + 1] = list[j + 1], list[j]  # swap
                swap_test = True
        if swap_test == False:
            break
    return list
Bubble sort is efficiently used on a list that is already sorted, except for a very small number of elements. For example, if only one element is not in order, bubble sort will only take 2n time. If two elements are not in order, bubble sort will only take at most 3n time.
Insertion sort is a simple sorting algorithm that is relatively efficient for small lists and mostly-sorted lists, and often is used as part of more sophisticated algorithms.
def insertion_sort(list):
    for i in range(1, len(list)):
        save = list[i]
        j = i
        while j > 0 and list[j - 1] > save:
            list[j] = list[j - 1]
            j -= 1
        list[j] = save
    return list
It works by taking elements from the list one by one and inserting them in their correct position into a new sorted list. In arrays, the new list and the remaining elements can share the array’s space, but insertion is expensive, requiring shifting all following elements over by one.
Selection sort is noted for its simplicity, and also has performance advantages over more complicated algorithms in certain situations.
def selection_sort(list):
    for i in range(0, len(list)):
        min = i
        for j in range(i + 1, len(list)):
            if list[j] < list[min]:
                min = j
        list[i], list[min] = list[min], list[i]  # swap
    return list
The algorithm finds the minimum value, swaps it with the value in the first position, and repeats these steps for the remainder of the list. It does no more than n swaps, and thus is useful where swapping is very expensive.
Heapsort is a much more efficient version of selection sort. Heapsort is an in-place algorithm, but is not a stable sort.
def heap_sort(list):
    first = 0
    last = len(list) - 1
    create_heap(list, first, last)
    for i in range(last, first, -1):
        list[i], list[first] = list[first], list[i]  # swap
        establish_heap_property(list, first, i - 1)
    return list

def create_heap(list, first, last):
    i = last // 2  # integer division, so this also runs under Python 3
    while i >= first:
        establish_heap_property(list, i, last)
        i -= 1

def establish_heap_property(list, first, last):
    while 2 * first + 1 <= last:
        k = 2 * first + 1
        if k < last and list[k] < list[k + 1]:
            k += 1
        if list[first] >= list[k]:
            break
        list[first], list[k] = list[k], list[first]  # swap
        first = k
It also works by determining the largest (or smallest) element of the list, placing that at the end (or beginning) of the list, then continuing with the rest of the list, but accomplishes this task efficiently by using a data structure called a heap, a special type of binary tree. Once the data list has been made into a heap, the root node is guaranteed to be the largest (or smallest) element. When it is removed and placed at the end of the list, the heap is rearranged so the largest element remaining moves to the root.
Merge sort implementations produce a stable sort, meaning that the implementation preserves the input order of equal elements in the sorted output. It is a divide and conquer algorithm.
def merge_sort(list):
    merge_sort_r(list, 0, len(list) - 1)
    return list

def merge_sort_r(list, first, last):
    if first < last:
        sred = (first + last) // 2  # integer division for the midpoint
        merge_sort_r(list, first, sred)
        merge_sort_r(list, sred + 1, last)
        merge(list, first, last, sred)

def merge(list, first, last, sred):
    helper_list = []
    i = first
    j = sred + 1
    while i <= sred and j <= last:
        if list[i] <= list[j]:
            helper_list.append(list[i])
            i += 1
        else:
            helper_list.append(list[j])
            j += 1
    while i <= sred:
        helper_list.append(list[i])
        i += 1
    while j <= last:
        helper_list.append(list[j])
        j += 1
    for k in range(0, last - first + 1):
        list[first + k] = helper_list[k]
It starts by comparing every two elements (i.e., 1 with 2, then 3 with 4…) and swapping them if the first should come after the second. It then merges each of the resulting lists of two into lists of four, then merges those lists of four, and so on; until at last two lists are merged into the final sorted list. Merge sort has seen a relatively recent surge in popularity for practical implementations.
Quicksort (also known as “partition-exchange sort”) is a comparison sort and, in efficient implementations, is not a stable sort. Quicksort relies on a partition operation: to partition an array, we choose an element, called a pivot, move all smaller elements before the pivot, and move all greater elements after it then recursively sort the lesser and greater sublists.
Efficient implementations of quicksort (with in-place partitioning) are typically unstable sorts and somewhat complex, but are among the fastest sorting algorithms in practice.
Here are some implementations and explanations of the quicksort algorithm I borrowed from Literate Programs.
The most straightforward way to implement quicksort in Python is to use list comprehensions. This approach generates two lists, one of elements greater than or equal to the “pivot” element (in this case the first element of the list), and one of elements less than the pivot. These two lists are then recursively sorted, before being concatenated around the pivot to form the final sorted list.
# http://en.literateprograms.org/Quicksort_(Python)
def qsort1(list):
    """Quicksort using list comprehensions"""
    if list == []:
        return []
    else:
        pivot = list[0]
        lesser = qsort1([x for x in list[1:] if x < pivot])
        greater = qsort1([x for x in list[1:] if x >= pivot])
        return lesser + [pivot] + greater
This implementation has the advantage of being clear and easy to understand, yet quite compact. As it turns out, it is also faster than other list-based quicksorts in Python.
Quicksort requires excessive time and space resources on data that is already sorted or nearly sorted. One way to improve the robustness of quicksort in the face of this kind of data is to randomly select a pivot, instead of just making the pivot the head of the list.
from random import randrange

def qsort1a(list):
    """Quicksort using list comprehensions and randomised pivot"""
    def qsort(list):
        if list == []:
            return []
        else:
            pivot = list.pop(randrange(len(list)))
            lesser = qsort([l for l in list if l < pivot])
            greater = qsort([l for l in list if l >= pivot])
            return lesser + [pivot] + greater
    return qsort(list[:])
An alternative to using list comprehensions is to partition the list in a single pass using a partitioning function that accumulates list items that are equal to the pivot, as well as those that are lesser and greater than the pivot. This also eliminates the need to process more than once any items that are equal to the pivot. In theory, this should result in a faster sort than the list comprehension version. However in practice it appears that the list comprehension implementation is not only faster than a single-pass partitioning approach, but approaches the performance of an in-place sort.
def qsort2(list):
    """Quicksort using a partitioning function"""
    if list == []:
        return []
    else:
        pivot = list[0]
        lesser, equal, greater = partition(list[1:], [], [pivot], [])
        return qsort2(lesser) + equal + qsort2(greater)

def partition(list, l, e, g):
    while list != []:
        head = list.pop(0)
        if head < e[0]:
            l = [head] + l
        elif head > e[0]:
            g = [head] + g
        else:
            e = [head] + e
    return (l, e, g)
Early versions of Python used a hybrid of samplesort (a variant of quicksort with a large sample size) and binary insertion sort as the built-in sorting algorithm. This proved to be somewhat unstable, so from version 2.3 onward Python uses an adaptive merge sort algorithm (Timsort).
def adaptive_merge_sort(list):
    list.sort()
    return list
$ python test-sort.py 50000
# elements: 50000
adaptive_merge_sort - 0.051 secs - passed
bubble_sort - 805.074 secs - passed
heap_sort - 1.088 secs - passed
insertion_sort - 380.116 secs - passed
merge_sort - 1.006 secs - passed
qsort1 - 0.469 secs - passed
qsort1a - 0.634 secs - passed
qsort2 - 28.783 secs - passed
selection_sort - 344.880 secs - passed
Times (in seconds) were taken on my first-generation MacBook Pro (2.16 GHz Intel Core Duo) under Mac OS X using the Python 2.7 package from MacPorts.
In the end, I realised that there is no best sorting algorithm for all data. Various algorithms have their own strengths and weaknesses. For example, some “sense” already sorted or “almost sorted” data sequences and perform faster on such sets.
I know my test is naive because it was run against a completely randomly shuffled array. There are different types of arrays that might better represent real-life situations, such as already sorted arrays, merged already-sorted arrays (“chainsaw” arrays), few-unique arrays (consisting of a small number of identical elements), arrays sorted in the right direction, arrays sorted in reverse order, large data sets with a normal distribution of keys, and pseudorandom data (for example Google Public Data).
After I tweeted that Python's built-in sort() is way faster than my quicksort implementations, @hasant pointed out that there is no "original" or pure function in a high-level programming language. Indeed, sorting can be done in any language, but I believe C and other compiled languages provide a better opportunity to see the effect of the instruction set and CPU speed on sorting performance.
Finally, I have some news from my music production activity.
First, I have an upcoming EP on Elektrax, scheduled for release in early February 2011. This EP is actually a remastered version of my old material (previously released on *Cutz) with additional remixes from Simone Barbieri Viale, Andreas Hermansson & David Gunther. Check out the samples at http://www.elektraxmusic.com/elek097/
Second, my remix of DJ Hi-Shock's famous track Asama Express will be released later in the same month. Check out the samples at http://www.elektraxmusic.com/elek100/
Oh, BTW… Happy New Year!
Earlier this week, Gawker, the company operating a number of the Internet's most popular blogs, including Gawker, Lifehacker, Gizmodo and Kotaku, was the victim of a security breach. A group of hackers calling themselves Gnosis compromised Gawker's servers and exposed the blogs' source code and a user database dump containing 1,247,893 accounts. Gnosis "kindly" shared their findings with the public, making them downloadable via torrent.
Inspired by the brief analysis done by the guys at Duo Security, I tried to duplicate their work. After some sanitisation of the Gnosis full_db.log file, I found that the hashes were DES-based crypt(3), which means that no matter how long a user's password was, it got truncated to 8 characters. I also found that 499,351 users had NULL, [none] or just an empty string in the password field (the second column).
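The truncation is easy to picture: traditional DES-based crypt(3) derives its key from at most the first 8 characters of the password, so any two passwords sharing that prefix produce the same hash. A toy sketch of the effect (not a real crypt implementation):

```python
def des_crypt_effective_key(password):
    """Toy model: DES-based crypt(3) only uses the first 8 characters."""
    return password[:8]

# Two different passwords, identical as far as DES crypt is concerned.
print(des_crypt_effective_key("gawker123456") ==
      des_crypt_effective_key("gawker12"))  # True
```

This is also why a cracker only ever needs to guess 8-character candidates against these hashes.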
I used the supreme password cracker, John the Ripper, against the sanitised hashes file. The test was conducted on a single-processor Linux machine with an (almost) standard configuration; I didn't even try JtR in a parallel environment. In this post I will focus on analysing the results and will explain my cracking method in a later post. The first test was run at midnight on 14 December 2010, but I forgot to do it the "right way", so I decided to redo the test the next night. The second test took approximately 3.5 hours, and JtR managed to crack 854,816 passwords (including the NULL, [none] and empty ones).
I also monitored the cracking progress, recording it every 5 minutes, and created a nice statistics chart with Google Spreadsheet, as you can see below.
The top 20 most common passwords (not including empty passwords) from my cracking results were:
4162 123456
3332 password
1444 12345678
861 lifehack
765 qwerty
529 abc123
503 12345
471 monkey
439 111111
410 consumer
391 letmein
372 1234
331 dragon
322 trustno1
320 gizmodo
319 baseball
311 whatever
305 superman
288 1234567
278 iloveyou
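A tally like the one above is easy to reproduce from a list of cracked passwords; a sketch using collections.Counter (the input list here is made up for illustration):

```python
from collections import Counter

# Hypothetical sample of cracked passwords, one entry per account.
cracked = ["123456", "password", "123456", "qwerty", "123456", "password"]

# Print "count password" lines, most common first.
for pw, count in Counter(cracked).most_common(2):
    print(count, pw)
# prints:
# 3 123456
# 2 password
```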
This is what happens when you have too much free time. I'll keep updating this post with more stuff as the results come in.
Update: The (remote) machine used for this password-cracking activity went down due to hardware failure. It finally came back up today, and luckily no data was lost. In the end, JtR managed to crack 924,210 passwords.
I recall the days when I had to work from a cafe or restaurant; back then, wireless technology was still relatively new and definitely a hip thing. I've witnessed how wireless hotspot technology evolved. It used to be pricey, as hotspot access was usually provided by ISPs, but today many cafes and restaurants give the access away for free so customers will stay much longer.
I’ve presented some stuff related to wireless hotspot security at some security conferences. These are my slides presented on Hack in The Box, Kuala Lumpur in 2005 and IT Underground, Prague in 2006.
How about today's wireless hotspot security? I still see a number of risks in the protocols and encryption methods, and also in the ignorance of users. Apparently, there's not much I can do about that last issue.
Yes, this is a very late news… as I discovered it quite late too. But well, late news is still news.
Although the project is over, our fellow Indonesian fans and Ravelex members apparently believe that SEGO deserves to be nominated at REDMA 2010. Not in one, but THREE categories.
How to vote, and all information related to REDMA 2010, is available on the Ravelex website. Hopefully it's not too late for you to vote, eh?
SEGO has now been nominated for two consecutive years, and I would personally like to thank all the friends and fans who have been supporting the project.
That’s all for now, guys.
PS: I’ve updated the Releases page too.