May 29, 2012


## Hashtag Twitter network creation source code

I'm trying to make a new graph for the report, but my laptop keeps crashing, most likely from lack of RAM.

Posting this here so I can get somebody to run it on their laptop while I finish off the report.

http://pastebin.com/Mwi51C53

Please don't abuse my OAuth secret tokens!

NOTE: this is based on some work by Drew Conway, but his code was deprecated, so it has been rewritten for Twython and the OAuth2 updates.

February 6, 2012


## Ensemble agreement to improve text classification

“We recommend ensemble agreement to enhance labeling accuracy. Ensemble agreement simply refers to whether multiple algorithms make the same prediction concerning the class of an event (i.e., did SVM and maximum entropy label the text the same?). Using a four-ensemble agreement approach, Collingwood and Wilkerson (2011) found that when four of their algorithms agree on the label of a textual document, the machine label matches the human label over 90% of the time. The rate is just 45% when only two algorithms agree on the text label.”

-from the getting started guide at: http://www.rtexttools.com/documentation.html
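
To make the idea concrete for myself, here is a minimal Python sketch of ensemble agreement. It is not RTextTools code. The function names and the toy labels are made up for illustration, and it assumes each document already has one predicted label per classifier plus a human label to compare against.

```python
from collections import Counter


def agreement(predictions):
    """Return (majority_label, n_agree) for one document.

    `predictions` holds one label per classifier. On a tie, the label
    that appears first in the list wins (Counter keeps insertion order).
    """
    label, count = Counter(predictions).most_common(1)[0]
    return label, count


def accuracy_by_agreement(all_predictions, human_labels):
    """Accuracy of the majority label, grouped by how many classifiers agreed."""
    hits = Counter()
    totals = Counter()
    for preds, truth in zip(all_predictions, human_labels):
        label, n_agree = agreement(preds)
        totals[n_agree] += 1
        if label == truth:
            hits[n_agree] += 1
    return {n: hits[n] / totals[n] for n in sorted(totals)}


# Toy data (invented): four classifiers, four documents.
toy_predictions = [
    ["sports", "sports", "sports", "sports"],
    ["politics", "politics", "politics", "politics"],
    ["sports", "sports", "politics", "tech"],
    ["tech", "tech", "sports", "politics"],
]
toy_human = ["sports", "politics", "sports", "politics"]

print(accuracy_by_agreement(toy_predictions, toy_human))
```

The point of grouping by agreement level is that documents where all classifiers agree can be trusted more, while low-agreement documents are the ones worth sending to a human coder.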

Also discussed on a related reddit thread:

“Maximum Entropy and Maximum Likelihood based models are particularly sensitive to smoothing choices in my experience. Good-Turing smoothing would be nice to have, as one example. Or, using term ranks rather than frequencies.”

NOTE TO SELF: this was useful, but I don't remember why now: http://www.webology.org/2006/v3n4/a35.html


## Mechanical Turk API, with boto, for crowd classification?

http://www.toforge.com/2011/04/boto-mturk-tutorial-create-hits/

January 26, 2012


## Incredible

http://scikit-learn.sourceforge.net/dev/auto_examples/applications/plot_stock_market.html


## SQL note to self

CREATE TABLE "sample_profile" (
    "id" integer NOT NULL PRIMARY KEY,
    "user_id" integer NOT NULL REFERENCES "auth_user" ("id"),
    "oauth_token" varchar(200) NOT NULL,
    "oauth_secret" varchar(200) NOT NULL
);
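
A rough sketch of how a table like this could be used to store and fetch a user's OAuth tokens, using Python's built-in sqlite3 module with an in-memory database. The "auth_user" table is not shown in the note above, so a minimal stand-in with a hypothetical "username" column is created here; `make_db`, `save_tokens` and `load_tokens` are illustrative names, not part of any project code.

```python
import sqlite3

# "auth_user" is a stand-in: only its "id" column is implied by the note.
SCHEMA = """
CREATE TABLE "auth_user" (
    "id" integer NOT NULL PRIMARY KEY,
    "username" varchar(30) NOT NULL
);
CREATE TABLE "sample_profile" (
    "id" integer NOT NULL PRIMARY KEY,
    "user_id" integer NOT NULL REFERENCES "auth_user" ("id"),
    "oauth_token" varchar(200) NOT NULL,
    "oauth_secret" varchar(200) NOT NULL
);
"""


def make_db():
    """Create an in-memory database with both tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces REFERENCES when foreign keys are switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def save_tokens(conn, user_id, token, secret):
    """Store one token pair; "id" is filled in automatically by SQLite."""
    conn.execute(
        'INSERT INTO "sample_profile" ("user_id", "oauth_token", "oauth_secret") '
        "VALUES (?, ?, ?)",
        (user_id, token, secret),
    )
    conn.commit()


def load_tokens(conn, user_id):
    """Return (oauth_token, oauth_secret) for a user, or None if absent."""
    return conn.execute(
        'SELECT "oauth_token", "oauth_secret" FROM "sample_profile" '
        'WHERE "user_id" = ?',
        (user_id,),
    ).fetchone()
```

Using `?` placeholders rather than string formatting keeps the token values out of the SQL text itself.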

January 24, 2012


## Efficiently using LaTeX on Linux (Ubuntu) for Note Taking

(This post is a work in progress and I only work on it when procrastinating at the moment)

With the LaTeX template I am about to provide and the tools I am about to recommend, you should be able to quickly assemble note snippets and pictures from lecture notes, condensing lectures or notes into just the bits you need. The interface will be condensed and satisfying:

[PIC]

HOW TO INCLUDE NEW LATEX PACKAGES:

This is

The best Linux IDE is emacs in my opinion, and many bloggers seem to agree with me[1].

You must install both LaTeX and AUCTeX. For LaTeX, use sudo apt-get install texlive (though you may already have it). Then, to get LaTeX support in emacs, you need AUCTeX (http://www.gnu.org/software/auctex/)

Advantages of AUCTeX are listed here: http://www-stat.stanford.edu/~naras/lunches/f96/talk4/talk.html

Good instructions on configuring emacs to work with AUCTeX are here:

http://soundandcomplete.com/2010/05/13/emacs-as-the-ultimate-latex-editor/

You may need to configure a .emacs file. It should be located in your home directory.

HOW TO SCREEN CAPTURE ONLY BITS YOU WANT

HOW TO SET KEY BINDINGS FOR MINIMUM INTRUSION

By default, the compiler that AUCTeX uses to generate output from the LaTeX code is “latex”, but if you want to include images in your document and avoid a “NO BOUNDING BOX” error you need to change this to “pdflatex”. This is straightforward: add this line to your .emacs file (to open it, use the command “emacs ~/.emacs”):

(setq TeX-PDF-mode t)

Now if you want to be able to make revision notes fast on different subjects you should make a header file. I’ve uploaded one for you here:

http://mjd96.user.srcf.net/work/REVISION_TEMPLATE.tex

You need to install the LaTeX packages the template uses, which come from CTAN (not CRAN, which is the R archive).


## More updates soon

I stopped using this tumblr and started using a physical lab book, which must be handed in with the final report apparently.

I will still add some things here, especially as any LaTeX formulae I write can be copied and pasted into the final report.

On the topic of reports, the Technical Milestone Report can be found at:

http://mjd96.user.srcf.net/TMR2.pdf

December 7, 2011


## Christmas just started early.

I’m ecstatic; Amazon.com has today nobly and kindly approved a grant for my project’s use of some EC2 servers, vastly increasing what I could foreseeably do with networks without having to pay.