Posted: June 19th, 2010 | Author: sofia | Filed under: curious, open | Tags: context free, javascript | No Comments »
Just found out about the Context Free Art project through its JavaScript port, ContextFree.js, done by Aza Raskin, the same guy involved in the development of Firefox. Raskin developed a lovely, playful site, Algorithm Ink. I actually showed it to a friend who’s not a developer and she started playing right away : )
Here’s an example: http://azarask.in/projects/algorithm-ink/#aa486456 . Click browse to see lots more.
Posted: November 24th, 2009 | Author: sofia | Filed under: curious | Tags: ner, nlp, português | 1 Comment »
For some time now I’ve been curious about NLP (natural language processing), both at the theoretical level (how the algorithms it uses work) and at the practical level. There isn’t, for example, a service like Apture for Portuguese. So, out of my searches, here are some links I found interesting:
LXSuite - a set of web services for the linguistic analysis of text, developed by the University of Lisbon. In LXSuite they are all bundled together, but they can be seen separately at LxCenter. Unfortunately, the service has no API and consent is required to use it.
LXService: Web Services of Language Technology for Portuguese - a paper describing the development of the web services above.
Linguateca - a whole set of NLP resources for the Portuguese language. HAREM - NER for Portuguese and its accompanying book are here. They also make several Portuguese corpora available here, including CETEMPúblico (available to download for free).
Portuguese Language Processing Service - a paper on the development of a set of NLP web services for Portuguese, developed in Brazil. In their own words:
In this paper, we describe F-EXT-WS, a Portuguese Language Processing Service that is now available at the Web. The first version of this service provides Part-of-Speech Tagging, Noun Phrase Chunking and Named Entity Recognition. All these tools were built with the Entropy Guided Transformation Learning algorithm, a state-of-the-art Machine Learning algorithm for such tasks.
This paper looks interesting and could be a starting point for further exploration (e.g. ETL/Entropy Guided Transformation Learning - see Portuguese corpus-based learning using ETL). I went to try F-EXT-WS, but registration is required and it gave an error.
Does anyone know of other interesting resources in this area?
Posted: July 27th, 2008 | Author: sofia | Filed under: curious | Tags: performance | 8 Comments »
Maybe this already exists, but since my searches didn’t turn up anything, I thought I’d post this.
You have an app coded more or less REST style. Every POST request implies a data change (-> cache becomes stale); every GET request implies no change in the data (-> cache stays fresh). So you know that if a POST request was made to domain.com/admin/news, the news cache becomes stale. I won’t go really deep here: if you change item 8 of the news table, you might have only 2 stale caches, the one that lists the news and the one that shows item 8 (i.e. domain.com/news and domain.com/news/8 or domain.com/news/title-of-article), rather than every cache belonging to the news group. But let’s keep it simple.
I would like to know if there’s anything out there that parses the Apache logs for write requests and, if there was a POST/PUT/DELETE to any URL, automatically does a GET to the corresponding URL according to a few configurable rules. For example, if a POST was made to domain.com/admin/news/8, then upon parsing the Apache logs it would do a GET request to domain.com/news/8, generating the cache ahead of the next user instead of making that user wait while a fresh cache is built. It would simply increase the cache hit ratio that users see. It would of course run as a cron job.
I like this solution because it really keeps the caching code (if cache exists, expire cache, use cache, etc) outside the app, becoming simply another layer, where it really should be.
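To make the idea concrete, here’s a rough Python sketch of what I have in mind. Everything in it is an assumption for illustration: the rule table (`RULES`), the function names (`urls_to_warm`, `warm`), and the admin/public URL layout are made up, and it assumes the Apache common/combined log format and a cache layer that has already expired the stale entries on the write.

```python
import re
from urllib.request import urlopen

# Matches the request part of a common/combined log line:
#   ... "METHOD /path HTTP/1.1" status ...
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)

WRITE_METHODS = {"POST", "PUT", "DELETE"}

# Hypothetical rules: admin URL pattern -> public paths to regenerate.
# "{0}" is filled with the first captured group (here, the item id).
RULES = [
    (re.compile(r"^/admin/news/(\d+)$"), ["/news", "/news/{0}"]),
    (re.compile(r"^/admin/news$"), ["/news"]),
]


def urls_to_warm(log_lines, rules=RULES):
    """Return the public paths to re-fetch, in first-seen order, no duplicates."""
    paths = []
    for line in log_lines:
        m = REQUEST_RE.search(line)
        if not m or m.group("method") not in WRITE_METHODS:
            continue
        # Only successful writes (2xx/3xx) are assumed to have changed data.
        if not m.group("status").startswith(("2", "3")):
            continue
        path = m.group("path").split("?", 1)[0]  # drop any query string
        for pattern, targets in rules:
            hit = pattern.match(path)
            if hit:
                for target in targets:
                    url = target.format(*hit.groups())
                    if url not in paths:
                        paths.append(url)
                break  # first matching rule wins
    return paths


def warm(base_url, paths):
    """Issue a GET for each path so the cache layer rebuilds its entry."""
    for path in paths:
        with urlopen(base_url + path) as response:
            response.read()
```

From cron, you’d feed it the new log lines and call something like `warm("http://domain.com", urls_to_warm(lines))`. A real version would also have to remember how far into the log it got on the previous run, which this sketch leaves out.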
I really think this makes sense from a REST perspective, so I suspect it’s already out there.
So, does anyone know of anything? Preferably in PHP, but Python or Ruby is OK too.
Thanx :=)