Thursday, September 29, 2005
The 4GB file size barrier
Remember when an 80-character punched card was the standard unit of computer storage? The first disk drive I worked with was the IBM 2311 on an IBM 360 computer. It held 7.25MB, so its capacity was about 45 standard boxes of 2,000 punched cards. Decent, but not huge. That story continues today. Under Windows, depending on the software implementation and the disk format, the maximum file size can be 2GB (about 2,000,000,000 bytes), 4GB or just under 16 terabytes (gargantuan). 2GB is decent, but no longer huge. Winsteps users are starting to request file sizes larger than 4GB. For this you need Windows XP with the NTFS disk file format (NT = Windows NT = New Technology, FS = File System) and a hard drive with at least 12GB of available space. A beta-test version of Winsteps reads and writes these huge files.
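The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration. The constant names are made up here, and 7.25MB is taken as decimal megabytes. The byte limits are the commonly quoted ones: a signed 32-bit offset for 2GB, an unsigned 32-bit offset for 4GB, and NTFS with 4KB clusters for the just-under-16-terabyte case. That is an assumption about where the Windows limits come from, not something Winsteps documents.

```python
# Punched-card capacity of an IBM 2311 disk pack (illustrative arithmetic only).
CARD_BYTES = 80            # one 80-column card holds 80 characters
CARDS_PER_BOX = 2000       # a standard box of cards
DISK_BYTES = 7.25e6        # 7.25MB, assumed to mean decimal megabytes

boxes = DISK_BYTES / (CARD_BYTES * CARDS_PER_BOX)
print(f"IBM 2311 capacity: about {boxes:.0f} boxes of cards")

# Commonly quoted file-size ceilings (assumptions, see the text above).
LIMIT_2GB = 2**31 - 1          # largest offset in a signed 32-bit integer
LIMIT_4GB = 2**32 - 1          # largest offset in an unsigned 32-bit integer
LIMIT_NTFS = 2**44 - 2**16     # 16 TiB minus 64 KiB, assuming 4KB clusters

for name, limit in [("2GB", LIMIT_2GB), ("4GB", LIMIT_4GB), ("NTFS", LIMIT_NTFS)]:
    print(f"{name:>4} limit: {limit:,} bytes")


def needs_large_file_support(size_bytes: int) -> bool:
    """Return True if a file of this size cannot be addressed with 32-bit offsets."""
    return size_bytes > LIMIT_4GB
```

A 5GB data file, for example, would make `needs_large_file_support(5 * 10**9)` return `True`, which is the situation the beta-test version is meant to handle.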
Sunday, September 18, 2005
40 years of computer programming: ugly, but it works!
It was about March 1965 when I wrote my first computer program. It was in EDSAC II autocode, intended to run on the University of Cambridge computer. But computer time was much too valuable to waste on us students, so the instructor merely glanced at our code and pronounced it satisfactory or unsatisfactory.
About June 1965 I constructed my first operational computer program. I got a vacation job as a programmer at Electronic Associates (a computer manufacturer) working on their analog and hybrid computers. Programs were "written" with wiring on plug boards. We coded real-time differential equations. Programming involved integrators, summers and feedback loops. We simulated oil wells, chemical factories, rockets in flight and such like. Analog computers have since vanished.
In 1968, I was programming accounting systems on IBM 360 computers in RPG. Everything was on 80-column punch cards. Then came magnetic tape drives and, a huge innovation, the Winchester disk drives. (Anyone remember the "datacell" - a direct-access magnetic-tape contraption?)
Around 1980, we old-time programmers started getting worried. Apple and CP/M personal computers were becoming available. We thought our days were numbered as all the teenage whiz-kid programmers (who learned programming from their cradles) would soon render us obsolete. Now it's 25 years later. Computers are everywhere. But finding a competent, productive programmer is still a challenge - except, perhaps, in India! Good programming requires careful planning, logic and patience - no matter how fast the computer is.
At the University of Chicago, around 1987, I took a computer programming course. It was taught by professors who had never programmed in a high-pressure production environment. They were concerned about beautiful code and state-of-the-art algorithms. My concern was fast coding (not fast code), easy maintainability (not beauty) and robust code (which still works when there are errors in the data). Those professors gave me a barely passing grade. They said "Your code is ugly, but it works!"
Are there any more "ugly" programmers out there with an interest in clients (of all types) with projects (of all types) needing statistics (of all types) implemented in computer programs (of all types)? If so, how about contacting me about joining the Winsteps team .....
Tuesday, September 13, 2005
Differential Item Functioning: DIF
The more papers are published about DIF detection, the more squirrelly ("eccentric, cunningly unforthcoming or reticent" - yourdictionary.com) it becomes. There have been comments that the Winsteps and Facets DIF estimates (computed as interaction terms, after the measure main-effects have been estimated) may be errant. So what is the best method? According to the literature, no method shines. Logistic regression is promising because it provides a DIF size estimate for polytomies, but, at least for dichotomies, "The DIF effect size measures based on logistic regression, however, appeared to be insensitive to the specified DIF conditions." (Hidalgo & López-Pina, EPM 64, 6, 903-915 (2004)).
So what to do? Mantel-Haenszel (1959) is well established for significance and size of dichotomous DIF. Mantel (1963) provides a significance test for ordinal data. These will be included in the Winsteps DIF Table, using "thin" matching based on estimated measures, so missing data can be accommodated. Here is part of the Knox Cube Test output, showing that Winsteps estimates DIF of 1.61 logits where the MH estimate is 1.95:
+--------------------------------------------------------------------------------------------------------+
KID    DIF     DIF   KID    DIF     DIF   DIF      JOINT                     MantelHanzl  TAP
CLASS  MEASURE S.E.  CLASS  MEASURE S.E.  CONTRAST S.E.   t     d.f. Prob.   Prob.  Size  Number Name
----------------------------------------------------------------------------------------------------------
F      2.85    .89   M      1.24    .70   1.61     1.13   1.42  32   .1639   .2049  1.95  13     1-4-3-2-4
+--------------------------------------------------------------------------------------------------------+
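The numbers in that table row can be checked by hand. A minimal Python sketch follows, assuming the DIF contrast is the difference between the two class-specific measures, the joint S.E. is the root sum of squares of the two standard errors, and t is their ratio. The function names and the 2x2 counts fed to the Mantel-Haenszel function are hypothetical, chosen only for illustration. The function uses the textbook common odds-ratio estimator, which is not necessarily how Winsteps computes it.

```python
import math


def dif_contrast(measure_1, se_1, measure_2, se_2):
    """Return (contrast, joint S.E., t) for two class-specific item measures."""
    contrast = measure_1 - measure_2
    joint_se = math.sqrt(se_1 ** 2 + se_2 ** 2)
    return contrast, joint_se, contrast / joint_se


def mh_log_odds(strata):
    """Textbook Mantel-Haenszel common log-odds ratio across matched strata.

    Each stratum is a tuple (a, b, c, d):
      a = group 1 successes, b = group 1 failures,
      c = group 2 successes, d = group 2 failures.
    Strata must be non-empty, and the denominator sum must be non-zero.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return math.log(numerator / denominator)


# Values for item 13 (1-4-3-2-4), taken from the table row above.
contrast, joint_se, t_value = dif_contrast(2.85, 0.89, 1.24, 0.70)
print(f"contrast={contrast:.2f}  joint S.E.={joint_se:.2f}  t={t_value:.2f}")

# Hypothetical counts for two ability strata (NOT Knox Cube Test data).
example_strata = [(8, 2, 6, 4), (5, 5, 3, 7)]
mh_size = mh_log_odds(example_strata)
print(f"MH log-odds for the hypothetical strata: {mh_size:.2f}")
```

With the table values, the first line reproduces the reported contrast of 1.61, joint S.E. of 1.13 and t of 1.42. "Thin" matching would correspond to forming one stratum per distinct estimated measure, which this sketch leaves to the caller.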
Saturday, September 03, 2005
Microsoft, Excel and Rasch
Wonderful! Microsoft may be developing Rasch-based scaling procedures in support of obtaining ISO-ANSI accreditation for their certification programs. Perhaps they are using Excel. Mark Moulton has an excellent Rasch demonstration Excel spreadsheet at
www.eddata.com/resources/publications/EDS_Rasch_Demo.xls
Could Microsoft's efforts produce a competitor for Winsteps? I hope so. Adding Microsoft's credibility and consequent publicity would give Rasch methodology a boost and increase the size of the Rasch pie. Bigger pie = bigger slices for everyone.