Welcome to Foxite.COM Community Weblog

The VFP MVP Awards for 2009/2010 have been announced

I am very proud that Microsoft have given me another MVP Award for the year 2009/2010. This is the twelfth time in succession that I have been so honored and I am always conscious of how very lucky and privileged I am to be recognized in this way.

Marcia has also been renewed as an MVP - this is her eleventh successive award - and she too is very proud of her award.

There are always many more people contending for the award than can possibly receive it, especially as the number of FoxPro MVPs is dwindling - this year there are only 22 of us - which makes the award even more flattering. We will, as always, try to live up to the standards demanded of us as MVPs.

posted by andykr

Best Practices for Coding

 

In my last article I was stressing the importance of testing code with realistic data volumes in order to detect potential performance issues when the code is actually deployed. That got me thinking about something that I don't normally think about: the quality of the code we write. Now you are probably wondering what on earth I mean by the "quality" of the code that we write. After all, code is either good (it works, does what it is supposed to and doesn't crash) or bad (it doesn't do what it is supposed to, or crashes). However, that addresses only the functionality of the code, not its quality.

One of VFP's greatest features is that there are almost always several different ways of doing the same thing. Unfortunately, this is also one of the worst things about the language because, if there were only one way of doing something, you would have no choice. What I notice, whenever I am working on code (my own or someone else’s), is how easily we developers fall into patterns of doing things. Once we have figured out the way to tackle some particular type of operation we tend to stick to it unquestioningly. I believe there are three critical issues that affect the quality of our code:

  • Maintainability    Ask yourself this question. Could someone (assuming they are a technically competent developer) take over maintaining and enhancing your code? Is there (current!) design and implementation documentation explaining what the code is supposed to do and how it actually does it? Is the code itself commented so that it is clear what is going on at every point?
  • Efficiency           Code can easily be functionally correct, but perform sub-optimally. The vacation calculator that took 45 minutes to run (that I referred to in my last article) is a perfect example of code that worked but was not optimal. As noted, determining whether the code is optimal is largely a matter of proper testing at each level, but that is detection after the fact. What can you do to avoid writing sub-optimal code to start with?
  • Best Practices    This is probably the hardest one of all. What are the "best practices" for VFP code? Who decides what they are, and where are they defined? The only answer to these questions is that I don't know.

Therein lies the problem; there are no absolute rules that can be applied to all of these issues, in all cases. As with so many things it is largely a matter of personal judgment and preference. What I can do is to offer some suggestions, based on my 20+ years of working with FoxPro in its various incarnations (if there are mistakes to be made, you can bet that I've probably made them all at one time or another). So here goes with my top ten rules for writing better code …

Rule #1: Never copy and paste code     Since every piece of code is different - in either intent or implementation (or both), there can never be a situation where you should copy and paste code that was written elsewhere. If you do find yourself copying and pasting then it means that you are in one of two situations. First you have an operation which closely mimics an existing operation. In this case you should probably be designing a more generic solution to handle both operations rather than trying to hack existing code. Or second, you are simply duplicating functionality that already exists elsewhere because, in its current form or location, it cannot be accessed directly. In this case the code should not be duplicated but should be extracted and made available as either a method or procedure to everything that needs to use it.

Rule #2: Don't adopt global solutions to solve local problems     There are basically two types of environmental control commands in VFP; those that are scoped to the datasession (e.g. SET EXCLUSIVE and SET DELETED) and those that are global to VFP itself (e.g. SET ESCAPE and SET NULLDISPLAY). In either case these commands are actually global – they affect everything within their scope and so should only be used when you really need to change the working environment as a whole. Consider the following code (from a real application, I may add) whose intent was to un-delete records previously flagged for deletion:

PROCEDURE <name>
USE <table> ORDER <tagname>
SET deleted off
SET filter to deleted()
***
***** Functional code here
***
SET deleted on
RETURN

Now, what's wrong here? First, there is absolutely no error checking at all and the USE command is poorly implemented; at the very least an "IN 0" should have been used to avoid inadvertently closing an already open table. However, the bad part is the explicit use of the global set commands. Irrespective of whatever else is going on in the application, all tables in the current datasession will be running with DELETED = ON after this procedure is called. While that is fine when it is the intended behavior, it shouldn't happen by chance. At the very least, the setting should have been checked on entry to the procedure and restored to whatever state was extant at the end. But the real issue is why even use a global command at all? If you want to un-delete records, why not just use RECALL FOR <condition>? It doesn't matter to the RECALL command whether deleted is set on or off and there is no need to mess with a global setting at all.
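
If a procedure really does have to change SET DELETED, the "check on entry, restore at the end" approach described above might be sketched as follows. The procedure name is hypothetical and the functional code is left as a placeholder:

PROCEDURE UndeleteRows
  LOCAL lcOldDeleted
  *** Remember the setting for the current datasession ("ON" or "OFF")
  lcOldDeleted = SET( "DELETED" )
  SET DELETED OFF
  ***
  ***** Functional code that must see deleted rows goes here
  ***
  *** Restore whatever state was in force when we were called
  SET DELETED &lcOldDeleted
  RETURN
ENDPROC

If the functional code can fail, the restore probably belongs somewhere that always runs, for instance in the FINALLY block of a TRY...ENDTRY structure.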

Rule #3: Ensure that Return Values are meaningful      It is often said that the difference between a procedure and a function is that a function returns a value. However, in VFP this is not actually true because the only difference between a procedure and a function is how the code is called, and whether parameters are passed by default by reference (in a procedure call) or by value (in a function call). What this means, therefore, is that all VFP procedures, functions and methods must always return a value to the calling code (if nothing else is specified the value is simply a logical .T.). Any code that might be interpreted as a function must, therefore, return a meaningful value. Here is an example of code that, in the application, is called using the old "=" syntax (i.e. =OpenDbf('table_name')):

****************************************
FUNCTION opendbf
PARAMETER TableName, OrderTag
****************************************
IF not used(TableName)
  SELECT 0
  USE (TableName)
ENDIF
SELECT (TableName)
IF parameters() = 2 and not empty(alltrim(OrderTag))
  SET order to tag &OrderTag
ENDIF

Notice that there is no explicit return value so this "function" will always return "True" – even after an error. What is needed here is some local error trapping (TRY…CATCH was introduced into the language for precisely this kind of situation) and a proper return value that informs the calling code whether the function achieved its objective, or not, so that the calling code can take the appropriate action.
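
One way the function might be reworked along those lines is sketched below. This is an illustration only, not the original application's code: the parameter names are invented, and it keeps the original's assumption that the table name doubles as the alias.

FUNCTION OpenDbf( tcTable, tcOrderTag )
  LOCAL llOk
  llOk = .T.
  TRY
    IF NOT USED( tcTable )
      *** IN 0 opens the table in an unused work area
      USE ( tcTable ) IN 0
    ENDIF
    SELECT ( tcTable )
    IF VARTYPE( tcOrderTag ) = "C" AND NOT EMPTY( tcOrderTag )
      SET ORDER TO TAG ( tcOrderTag )
    ENDIF
  CATCH
    *** Any failure (bad name, missing file, unknown tag) lands here
    llOk = .F.
  ENDTRY
  *** RETURN is not allowed inside TRY...ENDTRY, so it comes last
  RETURN llOk
ENDFUNC

The calling code can then test the result (IF NOT OpenDbf( 'table_name' ) ...) instead of discarding it.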

This leads directly to Rules 4 and 5:

Rule #4: Always include an explicit RETURN statement            The reason for this rule is that if you omit the return statement, and the last line of the procedure is a function call, there is no way to see the result in the debugger. For example, when debugging the procedure code above unless you pass the second (OrderTag) parameter the last line of code that you can step into is the "SELECT (TableName)" statement. By including the RETURN command you provide a stopping point for the debugger so that you can evaluate the results of the last block of code too!

Rule #5: Always check the return value from a function call     Possibly the worst example of violating this rule comes from the VFP Help file itself. In the TableUpdate() topic is the following "example":

REPLACE cLastName WITH 'Jones'
? 'New cLastName value: '
?? cLastName  && Displays new cLastName value (Jones).
= TABLEUPDATE(.T.)  && Commits changes.
? 'Updated cLastName value: '
?? cLastName  && Displays current cLastName value (Jones).

This example has caused more developers more grief and prompted more questions on technical forums than any other single item in the help file. Typically you see something like this:

Bug: TableUpdate not working!

My Save button calls TableUpdate and there is no error, but when I look in the table the changes aren't there. What's wrong?

The answer is almost always that the developer based their code on the Help file example! But as stated, quite clearly in the help topic (under 'Return Value'), TableUpdate() like all VFP functions returns a value. If the update succeeds, the return value is TRUE; if it fails, no error is generated but the return value is FALSE and the Windows error structure is populated. The VFP AERROR() function can be used to determine the actual error message. So the correct way to use the TableUpdate() function is:

REPLACE cLastName WITH 'Jones'
? 'New cLastName value: '
?? cLastName  && Displays new cLastName value (Jones).
IF NOT TABLEUPDATE(.T.)  && ATTEMPT to commit changes
  AERROR( laErr )
  MESSAGEBOX( laErr[2], 16, 'Update Failed' )
ELSE
  ? 'Updated cLastName value: '
  ?? cLastName  && Displays current cLastName value (Jones).
ENDIF

Any time I see code that uses "=FunctionName()" I am immediately suspicious. What this is doing is saying that the result of the function call is irrelevant, and while that may be true in certain cases, it is certainly not the norm and is certainly not good practice.

Rule #6: Always check, and validate, input parameters     This may seem obvious but, as shown in the OpenDbf() function above, this fundamental rule is not always followed. My own preference is to use code like this to validate parameters and, while there are many different ways of handling the issue, handled it must be!

FUNCTION AddConnection( tuConName )
  *** Check the input parameter
  IF VARTYPE(tuConName) <> "O"
    *** We didn't get a connection object
    IF VARTYPE( tuConName ) = "C" AND ! EMPTY( tuConName )
      loConDets = .GetConnection( tuConName  )
      IF NOT loConDets.lStatus
        *** Could not find this connection so just return
        RETURN .MakeResobj( .F. )
      ENDIF
    ELSE
      *** Invalid or no Connection Name passed
      This.LOGERROR( 9014, tuConName, LOWER( PROGRAM()))
      RETURN .MakeResobj( .F. )
    ENDIF
  ELSE
    IF LOWER( tuConName.Class ) <> "xconnection"
      *** Invalid Connection Object passed
      This.LOGERROR( 9014, tuConName, LOWER( PROGRAM()))
      RETURN .MakeResobj( .F. )
    ENDIF
    *** If we get to here we have a valid connection object

  ENDIF

As you can see, this code handles the case where an object of a specific type is expected, but where the name of an object may be passed instead of the reference. Either case is catered for and, while the code could be condensed it is both readable and unambiguous in this format. Which leads me directly to Rule 7.

Rule #7: Don't use 'cute', but obscure, coding  I, like most people, like others to think of me as being "clever" and I regard that as a natural enough desire. However, there is no excuse for projecting that desire into code so that it becomes obscure and difficult to understand. Consider the following piece of code:

DO CASE
  CASE ! CheckParameters( tcName, tlStatus, tdLastDate )
    lcError = "Invalid Parameters"
  CASE ! UpdateName( tcName )
    lcError = "Cannot update the specified name value"
  CASE IsDateOutofRange( tdLastDate )
    lcError = "Date is outside the allowable range"
  CASE ! SetStatus( tlStatus )
    lcError = "Cannot update the status flag"
  OTHERWISE
    lcError = ""
ENDCASE
IF NOT EMPTY( lcError )
  MESSAGEBOX( lcError, 16, 'Error Occurred' )
ENDIF

What this code is doing is simple enough – it is using the CASE construct to call each function in turn because, in each case, the function is expected to return a result that means that the case condition evaluates as FALSE. This forces the next CASE to be evaluated, and so on. However, this is not immediately obvious, and is certainly not a normal use of the DO CASE construct. Especially in the absence of any comments, it would be very hard to pick this out of a program and understand it without considerable effort. Similarly, code like this little function, also cute, is hardly readable, let alone maintainable:

LPARAMETERS tnDay
LOCAL lcDay
*** Convert day number to the day of the week
lcDay = IIF( tnDay = 1, 'Sunday', IIF( tnDay = 2, 'Monday', IIF( tnDay = 3, 'Tuesday', IIF( tnDay = 4, ;
'Wednesday', IIF( tnDay = 5, 'Thursday', IIF( tnDay = 6, 'Friday', IIF( tnDay = 7, 'Saturday', ;
'Invalid Day Number')))))))
RETURN m.lcDay
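
For comparison, one possible more readable form of the same function is sketched below. It is longer, but each day sits on its own line. This is an illustration, not code from the original application:

LPARAMETERS tnDay
LOCAL lcDay
*** Convert day number to the day of the week
DO CASE
  CASE tnDay = 1
    lcDay = 'Sunday'
  CASE tnDay = 2
    lcDay = 'Monday'
  CASE tnDay = 3
    lcDay = 'Tuesday'
  CASE tnDay = 4
    lcDay = 'Wednesday'
  CASE tnDay = 5
    lcDay = 'Thursday'
  CASE tnDay = 6
    lcDay = 'Friday'
  CASE tnDay = 7
    lcDay = 'Saturday'
  OTHERWISE
    lcDay = 'Invalid Day Number'
ENDCASE
RETURN m.lcDay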

And here's another example – which would you prefer to run across in code you have to debug?

m.Img = IIF(FILE(SYS(5)+SYS(2003)+"\Img\"+ m.ic +".Jpg"),SYS(5)+SYS(2003);
+"\Img\"+ m.ic + ".Jpg",SYS(5)+SYS(2003)+'\Img\NoImg.Jpg')

Or

IF FILE(SYS(5)+SYS(2003)+"\Img\"+ m.ic +".Jpg")
  m.Img = SYS(5)+SYS(2003)+"\Img\"+ m.ic + ".Jpg"
ELSE
  m.Img = SYS(5)+SYS(2003)+'\Img\NoImg.Jpg'
ENDIF

In each case the result is the same: the specified image file is checked for and, if found, assigned to the variable; otherwise a default image is assigned instead. But, in my opinion, clarity (and hence maintainability) is more important than merely saving a couple of lines of code.

Rule #8: Don't create unnecessary functions    What on earth is an "unnecessary" function? Simply it is a function whose only functionality is to call other functions. This example is from some code that someone sent to me recently:

IF NOT ISALLUPPER( String )
  string = UPPER( string )
ENDIF

What, I wondered, was the "ISALLUPPER()" function call for? It's not a VFP function so it had to be a UDF and, opening the code, this is what I found:

*************************************************
FUNCTION isAllUpper
*************************************************
PARAMETERS string
IF UPPER(string) = string
  RETURN .T.
ELSE
  RETURN .F.
ENDIF

This is totally absurd since there is no conditional logic here at all. The result is to always force the string to upper case, so the only code needed is:

string = UPPER(string)

Here's another example:

FUNCTION CityLine
LPARAMETERS lcCity, lcState, lcZipcode
F = ALLTRIM( lcCity ) + ", " + ALLTRIM( lcState ) + " " + ALLTRIM( lcZipcode )
RETURN F

Since all this is doing is concatenating the three input parameters, the code could just as easily concatenate them directly. Admittedly if this is done often and there is a possibility it might have to change, then there is some merit to the function (it concentrates the code in one place), or if there was some other variable that applied some other logic (country-specific formatting for example). However, as it stands it is clearly an unnecessary function and achieves nothing more than to introduce additional overhead into the program.

Rule #9: Don't use Magic Numbers       A 'magic' number is simply some undefined value which is used to interpret data in some way. Again, from a real application I got the following piece of code:

IF m_userlevel < 20
  IF m_userid = m_owner
    m_userlevel = 11
  ELSE
    m_userlevel = 12
  ENDIF
ENDIF

That is all that there was! Now, by interpreting what the code was actually doing I was eventually able to figure out that UserLevel 11 was "Executive" and UserLevel 12 was "Other Manager". But there were no other numbers used (anywhere in the code) and there was no definition – either in code, an include file or in a table – of what these numbers meant. If you need values like this they must be defined in a table somewhere. That immediately conveys four benefits. First, you won’t forget what they mean! Second, if you need to change the descriptions for these values, then you don’t need to change any code, you simply change the values in the table. Third, you have a consistent set of definitions which allow you to display meaningful text without having to resort to code like this:

DO CASE
  CASE lnUserLevel = 11
    lcDisplayName = 'Responsible Executive'
  CASE lnUserLevel = 12
    lcDisplayName = 'Manager'
  OTHERWISE
    lcDisplayName = 'Unknown User'
ENDCASE

and finally you have a way to interpret the values for those situations when you don't have the ability to write the code directly (like in a report, or an extract file for example).
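
A minimal sketch of the table-driven approach follows. The names userlevels, iLevel, cDescrip and GetLevelName are all hypothetical, and a cursor stands in for what would be a real table in an application:

*** Hypothetical lookup data; a real application would keep this in a table
CREATE CURSOR userlevels ( iLevel I, cDescrip C(30) )
INSERT INTO userlevels VALUES ( 11, "Executive" )
INSERT INTO userlevels VALUES ( 12, "Other Manager" )

? GetLevelName( 11 )
? GetLevelName( 99 )

FUNCTION GetLevelName( tnLevel )
  LOCAL lcName
  LOCAL ARRAY laName[1]
  *** Default used when the level is not defined in the lookup
  lcName = "Unknown User"
  SELECT cDescrip FROM userlevels ;
    WHERE iLevel = m.tnLevel ;
    INTO ARRAY laName
  IF _TALLY > 0
    lcName = ALLTRIM( laName[1] )
  ENDIF
  RETURN lcName
ENDFUNC

Changing a description, or adding a new level, then means editing a row, not editing code.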

Rule #10: Name Methods and Functions appropriately     This is another 'obvious' one! However, yet again, it is easy to find examples that show that people don't do it. Here is an example:

IF IsChange( m.username )
  REPLACE username WITH m.username IN loginusers
ENDIF

Now what would you expect this code to be doing? When I first saw it I assumed that it meant that if the user name has been changed, then update the "loginusers" table with the new user name. However, in the context of the application that didn't seem to make sense, so I went looking. Here is the code called by IsChange():

PROCEDURE ischange
PARAMETERS string
RETURN change2lower(string)

PROCEDURE change2lower
PARAMETERS string
STRING = alltrim(string)
retstring = ""
DO while not empty(string)
  * Raise first char as upper case
  STRING = upper(substr(string, 1, 1)) ;
         + iif(len(string) > 1, substr(string, 2), "")
  retstring = retstring + getFirstWord(string) + " "
  STRING = removeFirstWord(string)
ENDDO
RETURN retstring

If you examine this you will see that what it does is actually to capitalize the first letter of each word in the input string and put the rest into lower case. So not only are these examples of unnecessary functions - the actual code required is just

REPLACE username WITH PROPER( m.username ) IN loginusers

but the names used are totally inappropriate. Since all IsChange() does is to call Change2Lower() why is it named "IsChange"? Moreover, Change2Lower() does NOT change a string to lower case, it changes it to proper case.

How about this one, a program file named "Productn.prg"? What does it do? Well, here's the only comment from the code:

* Productn - Check for Duplicates

hardly what you would call a descriptive name for the program!

There are lots more things that could be included, but this is my current list of the 'Top 10' rules for writing code:

Rule #1: Never copy and paste code

Rule #2: Don't adopt global solutions to solve local problems

Rule #3: Ensure that Return Values are meaningful

Rule #4: Always include an explicit RETURN statement

Rule #5: Always check the return value from a function call

Rule #6: Always check, and validate, input parameters

Rule #7: Don't use 'cute', but obscure, coding

Rule #8: Don't create unnecessary functions

Rule #9: Don't use Magic Numbers

Rule #10: Name Methods and Functions appropriately

I'd be interested to hear from anyone with their list....

posted by andykr

But it worked fine on my machine!

How many times have you thought that? Or even worse, actually heard yourself saying it?

Come on, now, be honest!

When you hear that, in the context of an application, or some screen or function within an application, what does it really mean? Basically it means that, whatever "it" is, it wasn't properly tested. However, before addressing the question of how we can avoid finding ourselves in this situation, let's examine the question of "Testing" in a little more detail. What exactly do we, as applications developers, understand by testing?

Now, first, a caveat! This is a topic on which whole books have been written (I liked "Software Testing-A Craftsman's Approach" (Third Edition) by Paul C. Jorgensen) and I am certainly not going to try and get into a comprehensive discussion of testing and testing strategies in this article. However, we can reduce the issue to a number of discrete levels at which testing of any new piece of software should be done, these are:

  • Unit/Module                   This is the initial test whose purpose is to ensure that the code does what it is supposed to do without error. This should be done at the smallest possible functionally complete item (i.e. the form or class level) and is normally, entirely, the responsibility of the developer
  • Integration                    The purpose of integration testing is twofold; first, to ensure that the code continues to function when integrated with other components of the application and, second, that the addition of the new code does not cause problems for other components. This should be done at the lowest level of existing functionality that is directly impacted by the new code (i.e. the calling component or option) and is normally, primarily, a developer responsibility
  • Regression                    The objective of regression testing is to ensure that all existing parts of a system continue to function to specification after the addition of some new component or sub-system. This should be carried out at the highest level of applicable functionality (i.e. the Application as a whole). There are several types of regression testing of which the two most common are "Sanity Testing" (which checks for unexpected or bizarre behaviors) and "Smoke Testing" (which tests basic functionality). Irrespective of implementation, the essence of regression testing is to ensure that existing functionality remains unchanged by the addition of new functionality. This is, essentially, a QA function, although smoke testing by developers should always be an essential step in the build process (i.e. to ensure that the build was successful)
  • User Acceptance (UAT)   This is the acid test for any application or system. The objective is to put the system into a "production" environment and have "users" work on the system in as realistic a fashion as possible. Often this is done in a special QA environment, sometimes it is done by actually installing the software on the client's hardware. Either way, this is not normally within the purview of the developer but it is often the point at which the dreaded phrase "it worked fine on my machine" is heard from the developer

So what is it that causes the "it worked fine on my machine" reaction? While there are many possible reasons for something failing to perform as expected, basically they come down to one of two things. Either:

  • The functionality was only tested to ensure that it "worked" (i.e. that it did what it was designed to do under the conditions in which it was developed). This usually indicates a failure at the lowest level of testing because some condition or combination of conditions was not explicitly handled by the code
  • The data with which it was tested did not accurately represent the data against which it really had to work. This represents a failure at all levels of testing above the initial Unit/Module test because there is, at the end of the day, little point in testing anything unless the data with which you are working is realistic

There is nothing really that I can say, in general terms, about the first of these issues. The requirements for defensive coding, graceful and comprehensive error handling and thorough testing are well understood. Unfortunately, all too often they are 'More honour'd in the breach than the observance' (Hamlet, Act 1, Scene 4, by William Shakespeare).

  • Literary Note: Despite common usage, as with many sayings taken out of context from Shakespeare's writings, the meaning is actually quite different from that which we usually take it to be. In this case, it is almost the exact opposite! Hamlet is really saying that it would be more honorable to discontinue the King's practice of holding drunken parties than to go along with, and thereby condone, it.

The second, however, is something that we can address. We absolutely must ensure that we have realistic data both in terms of quality and quantity to test against. Here are a couple of examples of what can happen if we don't:

Several years ago, Marcia and I worked on an application (built by someone else) whose initial interface was an empty screen into which the user typed criteria for finding a client. When the search string was submitted the application displayed the standard Windows "Searching" animation (you know, the one where a torch moves back and forth, illuminating folders) and, after a few seconds, the screen would be populated with the result set. So what's wrong with that, I hear you ask? Well, nothing really, except that, in the test data there was only one client. The fact that it took 'a few seconds' to return the result was, surely, an indication that something was amiss.
It turned out, on investigation, that there was a three-second wait programmed into the code. Probably it was added by the original developer to test the animation, but somehow it got left in the final version of the code that got checked into source control! Had someone correlated the observed behavior with the size of the dataset this would have been picked up immediately – but no-one did.

Another example comes from a very large company that had a process that calculated vacation entitlement for each employee based on their years of service and actual vacation/leave of absence time taken in the previous 5 years. This was a complex piece of processing (involving processing the timesheet history for the entire 5-year period) but, in the development environment, it worked and gave verifiably correct answers in about 12 seconds which was, for an occasional process, considered to be acceptable. However, after a couple of years in production the same process on the live system was taking up to 45 minutes – which was not acceptable!

So what was the difference? In a word, Volume!

In the test environment there were just under 34,000 rows in the timesheet database but, in production, there were almost 10,000,000. (The timesheet system generated two records per day for each employee, so for five years of history for 2,500 or so employees you get around ten million rows.) This is a classic example of a test environment that simply was not realistic! The 34,000 rows represented about 7 days' worth of timesheet data for the 2,500 employees (less than 0.35% of the production volume) – hardly a representative sample for a process that had to run against 5 years' worth of data! So while the code worked acceptably with the test data, it totally failed in Production when confronted with real data volumes.
The cause (not surprisingly) turned out to be poorly optimized code. Using the coverage profiler in the test environment we could see that the bulk of the time (8.96 seconds out of 12 seconds - almost three-quarters of the total processing time) was spent doing replacements into three fields in the same table using one replace statement per field! To make it worse this was being done inside an unfiltered SCAN loop! Why use multiple REPLACE statements rather than one single REPLACE that updated all affected fields and why an unfiltered SCAN instead of REPLACE ALL?

Answer: Probably because the programmer originally had other stuff inside the scan loop, and later changed their mind but not their code!

Replacing the SCAN/REPLACE with a simple REPLACE ALL produced identical results to the original – but in less than 1 second compared to the original 12 seconds. Along with some other improvements to the code and the addition of proper indexes on the tables we managed to get this process down from around 45 minutes to less than 2 minutes on the live system.
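
The two patterns can be compared side by side in a small sketch. The cursor, its fields and the values written are all invented for illustration; the real process computed its values from the timesheet history:

*** Hypothetical demo cursor with three fields to update
CREATE CURSOR vacdemo ( iEmpPK I, nYears N(4,1), nTaken N(5,1), nDue N(5,1) )
LOCAL lnCnt
FOR lnCnt = 1 TO 1000
  INSERT INTO vacdemo ( iEmpPK ) VALUES ( m.lnCnt )
ENDFOR

*** The slow pattern: one REPLACE per field inside an unfiltered SCAN
SELECT vacdemo
SCAN
  REPLACE nYears WITH 5
  REPLACE nTaken WITH 10
  REPLACE nDue WITH 15
ENDSCAN

*** The same update as a single command that visits each row once
REPLACE ALL nYears WITH 5, nTaken WITH 10, nDue WITH 15 IN vacdemo

Both forms leave the cursor in the same state; the difference only becomes visible in the coverage profiler or at realistic row counts.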

So, what can we do about it? The first, and most obvious, solution is to ensure that development and testing environments use copies of actual production data whenever possible. The data does not have to be 'real time' but must be current. Many companies use a daily, or weekly, refresh process to update their development environments to ensure that new code can be thoroughly and realistically evaluated. However, this is not always a simple task – especially when sensitive, or confidential data is involved. While it is possible to 'scrub' data, to remove confidential or sensitive values (like replacing real Social Security Numbers with invalid but realistically structured values, real Email addresses with standard in-house addresses and so on) this has to be done properly. I once worked on a company's data that had been poorly scrubbed; all Email addresses were replaced with the company's own Email, all SSNs with the same "888-999-1111" value, all account numbers with "0123456789" and so on. The result was actually worse than useless since you could not actually validate anything in the test data because, when all values are the same, all results are "correct" (but meaningless!).

There is another issue with using and refreshing copies of real data. This is when you have "standard" test data that must be present (used in regression tests for example) or data that must be entered through the application for some reason. It is possible to address this by ensuring that all such data is 'scripted' out so that it can be re-applied after a refresh, but this is not a satisfactory solution (apart from anything else, it's too easy for it to go wrong).

A better solution, in my opinion anyway, is to generate realistic-looking, but spurious, test data in the first place. This is not really as hard as it might appear at first glance because VFP is very good at this sort of processing. For example, one of the commonest requirements for any application is to handle names and addresses, so let's see what we can do to generate some "test data". The first thing is that we will need some metadata to use. I created, for this purpose, a set of tables:

  • TITLES             A set of standard address titles and an associated gender indicator. Thus "Miss" is defined as "Female", while "Mr" is "Male". Combinations like "Mr & Mrs" are also "male", while "Prof" exists twice, once as male and once as female. This is used to choose a name based on the form of address selected
  • FORENAMES      A list of forenames each of which is associated with a gender indicator. The gender indicator is used, as noted above, to associate a name with the selected title
  • SURNAMES        Simply a list of possible surnames. There are no associated fields with this table; it is simply a list
  • PLACES             Another list of names, this table contains the list of candidates for use as the name of a street
  • STREETS           Yet another list, this time of street name suffixes (Avenue, Boulevard etc)
  • CYSTZIP            This is the most complex table of all since we need to ensure that City/State/Zip and Telephone Area Codes are correctly matched (otherwise we would not be able to test validation routines!). So this table contains real data from my standard zip code master table

Having got our metadata, all we need is some code to generate random numbers and select the record whose record number matches as "data". A simple function returns a random number between two specified values:

FUNCTION GenNum (tnLoVal, tnHiVal )
LOCAL lnSel
 lnSel = ROUND( (tnHiVal - tnLoVal) * RAND(), 0 ) + tnLoVal
 lnSel = IIF( lnSel <= 1, 1, lnSel )
 RETURN lnSel

Now, in order to generate our data, we need a target cursor:

CREATE CURSOR gendata ( ;
  iPsnPK INTEGER(  4 ), ;
  cTitle VARCHAR( 10 ), ;
  cFName VARCHAR( 30 ), ;
  cInit  VARCHAR(  1 ), ;
  cLName VARCHAR( 30 ), ;
  cSex   VARCHAR(  1 ), ;
  cAddr  VARCHAR( 30 ), ;
  cCity  VARCHAR( 20 ), ;
  cState VARCHAR(  2 ), ;
  cZip   VARCHAR(  5 ), ;
  cPhone VARCHAR( 12 ), ;
  dBorn  DATE( 8 )) 

Now it's a simple task to generate our data. First we grab a title (and the associated gender indicator) at random by calling the GenNum() function and passing the record number limits for the titles table:

  lnSel = GenNum( 1, 10 )
  GOTO lnSel IN titles
  lcSex = ALLTRIM( titles.cSex )
  lcTitle = titles.cTitle

Next we use the gender indicator to get the first name and middle initial. (Note: I set the forenames table up by sorting the names on "gender + name", but this could easily be done in real time using queries to build cursors of names by gender and selecting from the appropriate one. Since my data is static I didn't bother in this case.)

  *** Now a suitable first name
  lnSel = IIF( lcSex = 'F', GenNum( 1, 241 ), GenNum( 242, 436 ))
  GOTO lnSel IN forenames
  lcFName = forenames.cName
 
  *** And a middle initial
  lnSel = IIF( lcSex = 'F', GenNum( 1, 241 ), GenNum( 242, 436 ))
  GOTO lnSel IN forenames
  lcInit = LEFT( forenames.cName, 1 )

We generate the Surname, Street Name and Street descriptor in precisely the same way from the Surnames, Places and Streets tables respectively. The street number is just a random number between 1 and 9999. Next we need to get a set of City/State/Zip and Area codes – chosen at random from the cystzip table – and then generate exchange and phone numbers as random numbers in the range 100 to 999 (for exchange) and 1000 to 9999 (for phone number) respectively. We now have all the elements of the address and phone number.

The last remaining piece is a date of birth. This is a little more tricky, but can be done in various ways. I opted for the simplest: generate a year at random between 1918 and 1989 (to give ages in the range 20 – 91), a month at random, and a day restricted to the range 1 through 28 so that the result is always a valid date. Obviously if Date of Birth was critical to your application you would need a more sophisticated algorithm to generate the dates, but for simple test data this works for me.
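
As a rough illustration of the date of birth step, here is a minimal sketch. The variable names are assumptions for illustration only and may not match the code in the attached zip file; GenNum() is repeated from earlier so that the snippet runs on its own.

```foxpro
*** Sketch only: variable names are illustrative assumptions
LOCAL lnYear, lnMonth, lnDay, ldBorn
lnYear  = GenNum( 1918, 1989 )
lnMonth = GenNum( 1, 12 )
*** Days are limited to 1-28 so any month/year combination is valid
lnDay   = GenNum( 1, 28 )
ldBorn  = DATE( lnYear, lnMonth, lnDay )
? ldBorn

*** Same function as shown earlier in the article
FUNCTION GenNum( tnLoVal, tnHiVal )
  LOCAL lnSel
  lnSel = ROUND( (tnHiVal - tnLoVal) * RAND(), 0 ) + tnLoVal
  lnSel = IIF( lnSel <= 1, 1, lnSel )
  RETURN lnSel
ENDFUNC
```

Restricting the day to 28 means the 29th to 31st never appear, which is the trade-off accepted above in exchange for not having to check month lengths or leap years.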

I didn't need Social Security Numbers in this set, but they are just as easily generated as any other number. In this case, though, we should take care not to use 'valid' numbers. As of July 2009 a valid SSN cannot have an area number (the first three digits) between 734 and 749, or above 772, so provided we only generate numbers that use these "invalid" ranges we can never hit on a real SSN, even by accident.
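
A minimal sketch of such a generator follows. It uses only the 734 to 749 area range quoted above, which reflects the rules as of July 2009, so check the current rules before relying on it. The variable names are assumptions for illustration.

```foxpro
*** Sketch only: generates an SSN-shaped string in an unassigned range
LOCAL lnArea, lnGroup, lnSerial, lcSSN
*** MIN() guards against RAND() returning exactly 1
lnArea   = MIN( 749,  734 + INT( RAND() * 16 ))
lnGroup  = MIN( 99,   1 + INT( RAND() * 99 ))
lnSerial = MIN( 9999, 1 + INT( RAND() * 9999 ))
lcSSN = TRANSFORM( lnArea ) + '-' + ;
        PADL( TRANSFORM( lnGroup ), 2, '0' ) + '-' + ;
        PADL( TRANSFORM( lnSerial ), 4, '0' )
? lcSSN
```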

The whole code is wrapped in a loop, so I can generate any number of records that I want simply by passing in the required number of records. On my machine, executing "DO GenData WITH 550" generates 550 random names and addresses in about 0.5 seconds! Larger data sets take longer, of course; 55,000 names take about 5 seconds and 1,000,000 about 90 seconds! As I said, VFP is very good at this sort of processing!

Remember too that these names and addresses are "valid" in the sense that city, state and zip code are real (and match), and that the area code is correct for the zip code, but everything else is randomly generated. Of course it is possible that one might occasionally hit on a "real" address (in which the randomly generated street number and name exactly match a real address in a real city/state/zip) but the chances of also generating a name that really is associated with that address, let alone the actual phone number, are infinitesimally small.

With a little bit of thought, and planning, you can develop similar routines for almost any set of data that you will ever need. Do this, and you will never again run into the situation where your test data is not realistic, or adequate.

The attached zip file includes my metadata tables and the code for generating the names and address cursor. As always, please feel free to modify and improve on my stuff. Just let me know what you do with it so that I can benefit too.

posted by andykr | 5 Comments
Attachment(s): GenData.zip

Properly Formatting Text Strings

One of the perennial issues that we all encounter from time to time is that of properly formatting words that have been entered into a database. I am sure that, just as I have, you will at least once have run into the situation where you have to produce a personalized letter, report, mailing label or some such printed output that uses names from the database and have ended up with something that looks like this:

Mr JEFFREY WILSON
123 LETSBE AVENUE
seldom
Wilts

 

Which of course is the result of using lookup keys for the Title and County and direct user input for everything else. By and large, VFP is pretty good about string handling and it even provides a PROPER() function which will easily handle this example, turning it into:

Mr Jeffrey Wilson
123 Letsbe Avenue
Seldom
Wilts

However, things are not so easy when we have to deal with a specially formatted word, or name, like "O'Reilly" (comes out as "O'reilly") or "MacDonald" (ends up as "Macdonald"). There have been, over the years, various attempts to handle these, but none are totally satisfactory, although some are very close. However, the situation gets worse when you move away from simply dealing with names and addresses and have to apply formatting to longer strings of text, like paragraph headings in a legal document.

Here's one I had to deal with recently:

PRE-EXISTING CONDITIONS THAT AREN'T SPECIFICALLY COVERED BY SECTION 12A (SUB-SECTION 2)

Applying the PROPER() function resulted in:

Pre-existing Conditions That Aren't Specifically Covered By Section 12a (sub-section 2)

Whereas what I really needed was:

Pre-Existing Conditions that aren't Specifically Covered by Section 12a (Sub-Section 2)

So I sat down to figure out exactly how best to tackle the problem. I quickly realized that there are three basic scenarios which could apply to any given 'word'. First, it could simply follow the standard rule and have the first letter capitalized. Second, it could be a word that is not simply capitalized but has special rules, like Scottish names. Third, it could be an exception to the first two and either should not be capitalized at all (words like "that" and "aren't") or requires special, non-rule-based, formatting (names like 'DuMaurier' and 'de Torres', or specialized words like "FoxPro" and "SQL").

It seemed that the simplest way to handle these (non-rule based) exceptions was to create a table and define them. So that's what I did. The table, named changecase.dbf has the structure shown at Table 1 and is shown in Figure 1:

So much for the exceptions, how about those Scottish names? Well, several years ago, Sue Cunningham sent me a routine that she used for managing Scottish name formatting and I adopted that for my own use. The code is very slick and rather than trying to define all possible names uses a series of tests to determine if the word is a recognized Scots name. Here is my version of Sue's code:

********************************************************************
*** [P] CHECKSCOTS(): Check for standard Scottish names
********************************************************************
PROTECTED FUNCTION CheckScots( tcInWord )
  LOCAL lcOutWord, lnLen, llIsScots, lcTest
  *** Note: Space marker has already been removed in calling routine
  *** So the length here is the true length of the name
  lcOutWord = ALLTRIM( tcInWord )
  *** Check for the shortened form first   
  IF UPPER( SUBSTR( lcOutWord, 1, 2 )) == 'MC'
    RETURN 'Mc' + PROPER( SUBSTR( lcOutWord, 3 ))
  ELSE
    *** Process the word through the parser
    lnLen = LEN( lcOutWord )
  ENDIF

  *** Need to test the names in descending order of length to eliminate
  *** the need to be too explicit
  IF ! llIsScots AND lnLen >= 7
     lcTest = UPPER( LEFT( lcOutWord, 7 ) )
     llIsScots = INLIST( lcTest, 'MACADAM', 'MACCAFF','MACCARL','MACCLOS', ;
                 'MACCONN','MACCRAC','MACCULL','MACHENR', ;
                 'MACLANE','MACLEAN','MACLEOD','MACLAUG')
  ENDIF
  IF ! llIsScots AND lnLen >= 6
     lcTest = UPPER( LEFT( lcOutWord, 6 ) )
     llIsScots = INLIST( lcTest, 'MACART','MACAFF','MACINT', ;
                 'MACIVE','MACKAY','MACKEN','MACLAR','MACRAE','MACWIL')
  ENDIF
  IF ! llIsScots AND lnLen >= 5
     lcTest = UPPER( LEFT( lcOutWord, 5 ) )
     llIsScots = INLIST( lcTest,'MACKA')
  ENDIF
  IF ! llIsScots AND lnLen >= 4
     lcTest = UPPER( LEFT( lcOutWord, 4 ) )
     llIsScots = INLIST( lcTest, 'MACB','MACC','MACD','MACF','MACG', ;
                 'MACM','MACN','MACP','MACT','MACV')
  ENDIF

  *** If this is a Scottish Name, format it correctly
  IF llIsScots
    lcOutWord = 'Mac' + PROPER( SUBSTR( lcOutWord, 4 ))
  ENDIF
  RETURN lcOutWord
ENDFUNC

That takes care of the names, and the exceptions, which now leaves only the question of how to parse the input string and handle the simple capitalization. The way I do this is to replace all spaces in the input string with a non-alphanumeric character (I use CHR(96)). This marks the position of any original spaces in the string. The next step is to parse the entire string one character at a time and add a space immediately after any character that is neither a letter nor a number.
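
The marking and spacing step just described can be sketched as follows. This is a minimal illustration, not the code from the attached class; the function name MarkAndSpace() is a hypothetical one chosen for this example.

```foxpro
*** Sketch only: MarkAndSpace() is a hypothetical name
? MarkAndSpace( "AREN'T COVERED (SUB-SECTION 2)" )
*** Displays: AREN' T` COVERED` ( SUB- SECTION` 2)

FUNCTION MarkAndSpace( tcInput )
  LOCAL lcMarked, lcOut, lnCnt, lcChar
  *** Mark every original space with CHR(96)
  lcMarked = STRTRAN( ALLTRIM( tcInput ), ' ', CHR(96) )
  lcOut = ''
  FOR lnCnt = 1 TO LEN( lcMarked )
    lcChar = SUBSTR( lcMarked, lnCnt, 1 )
    lcOut = lcOut + lcChar
    *** Add a space after anything that is neither a letter nor a digit
    IF NOT ( ISALPHA( lcChar ) OR ISDIGIT( lcChar ) )
      lcOut = lcOut + ' '
    ENDIF
  ENDFOR
  *** Drop the space added after a trailing punctuation character
  RETURN ALLTRIM( lcOut )
ENDFUNC
```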

Now, you will be thinking, why on earth would he remove spaces, and then add them back? The answer is to catch any embedded characters like apostrophes or hyphens. After running this input string

PRE-EXISTING CONDITIONS THAT AREN'T SPECIFICALLY COVERED BY SECTION 12A (SUB-SECTION 2)

through my spacing routine, it now looks like this:

PRE- EXISTING` CONDITIONS` THAT` AREN' T` SPECIFICALLY` COVERED` BY` SECTION` 12A` ( SUB- SECTION` 2)

The result, as you can see, is to separate out, into "words", the partial words that were previously hidden. Now the process is straightforward: I simply use the native GETWORDCOUNT() and GETWORDNUM() functions to step through the string one "word" at a time. First I check to see if the 'word' exists in the formatting table. If so, I apply that formatting; otherwise I simply apply the native PROPER() function. Finally, before restoring the word, I check to see if it is a Scottish name. After this process the string now looks like this:

Pre- Existing` Conditions` that` aren' T` Specifically` Covered` by` Section` 12a` ( Sub- Section` 2)

Now I can restore the original spacing by removing spaces and then replacing all occurrences of CHR(96) with a space. The result is:

Pre-Existing Conditions that aren'T Specifically Covered by Section 12a (Sub-Section 2)

The final check is to remove any characters that were processed as if they were single character words but which are now terminal letters (the "T" in "aren't" is an example here). This is done by looking for an apostrophe in position 3 or more of each word in the string. If there is one, then everything after the apostrophe is forced to lower case. The final result of my test string is, therefore:

Pre-Existing Conditions that aren't Specifically Covered by Section 12a (Sub-Section 2)

Which is exactly what I needed. The code is packaged up as a class, based on the Session class (so that its table won't interfere with anything else in the environment) and is written so as to use GetWordCount() and GetWordNum() under VFP Version 7.0 or higher, or to use the equivalent functions from FoxTools for Version 6.0 or earlier. The calling syntax and interface are very simple:

oFormat = NEWOBJECT( 'xChgCase', 'changecase.prg' )
oFormat.Formattext( [old macdonald had a farm, isn't that cute?])
Result = Old MacDonald Had a Farm, isn't that Cute?

Note that one consequence of my processing is that I need to include partial words like "aren" and "isn" in the formatting table to prevent them from being capitalized inappropriately. But of course I don't need to differentiate between "it'll" and "it'd" because my terminal capital handling forces everything after an apostrophe in the third position or later to lower case anyway.
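
The last two steps (restoring the original spacing, then forcing terminal letters after an apostrophe to lower case) can be sketched like this. It is a simplified illustration rather than the code in the attached class; RestoreSpacing() and FixApostrophes() are hypothetical names, and GETWORDCOUNT() and GETWORDNUM() require VFP 7.0 or later.

```foxpro
*** Sketch only: function names are hypothetical
LOCAL lcText
lcText = "Pre- Existing` Conditions` that` aren' T` Specifically` Covered"
lcText = RestoreSpacing( lcText )
? lcText                     && Pre-Existing Conditions that aren'T Specifically Covered
? FixApostrophes( lcText )   && Pre-Existing Conditions that aren't Specifically Covered

FUNCTION RestoreSpacing( tcText )
  *** Remove the added spaces, then turn the CHR(96) markers back into spaces
  RETURN STRTRAN( STRTRAN( tcText, ' ', '' ), CHR(96), ' ' )
ENDFUNC

FUNCTION FixApostrophes( tcText )
  LOCAL lcOut, lnCnt, lcWord, lnPos
  lcOut = ''
  FOR lnCnt = 1 TO GETWORDCOUNT( tcText, ' ' )
    lcWord = GETWORDNUM( tcText, lnCnt, ' ' )
    lnPos = AT( "'", lcWord )
    *** Only an apostrophe in position 3 or later is treated as terminal
    IF lnPos >= 3
      lcWord = LEFT( lcWord, lnPos ) + LOWER( SUBSTR( lcWord, lnPos + 1 ))
    ENDIF
    lcOut = lcOut + IIF( lnCnt > 1, ' ', '' ) + lcWord
  ENDFOR
  RETURN lcOut
ENDFUNC
```

Because the test is "position 3 or later", a name like "O'Reilly" (apostrophe in position 2) is left untouched by this step.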

The code, and my formatting table, are included in the zip file attached to this column. As always, please feel free to modify and improve, and please share your improvements.

posted by andykr | 1 Comments
Attachment(s): ChangeCase.zip

So, what's so bad about Public Variables?

Every once in a while the question comes up of whether Public Variables are something that should be used in VFP. Now, there are lots of opinions on this matter and there are several arguments that come up time and time again in defense of using Public Variables. Here are a few:

  • "I use public variables to hold application wide data - global data - and have NO problem with them"

The only possible answer to this one is "Well good for you!" This is not an argument for using Public Variables (or anything else for that matter). This is the same as the Clothes Shop Salesman I once ran into who, when I asked for a pair of plain black trousers, (i.e. with no cuffs or pleats) told me that "Pleated front trousers are our most popular style" . Maybe it was true but that wasn't a reason for me to buy them when I had specifically asked for trousers without pleats!

Just because in your particular environment you have never run into problems with Public Variables, does not mean that there are none. All it says is that you don't understand the issues.

  • I name all global/public wide variables so that they begin with a 'g' - and never define local variables with a name beginning with a 'g'

This simply makes no sense and shows only that the person has no understanding of how VFP handles variables  - because when you define your variables there is no issue anyway. VFP handles the scope appropriately all by itself, problems arise only when you do NOT explicitly define your variable scope.

Thus, if you declare a variable as LOCAL with a name that happens to clash with a Public Variable, VFP only accesses the local variable (for as long as it remains in scope). If you re-declare an existing global variable as PRIVATE, then VFP automatically hides any existing variable of the same name and, again, only accesses the declared variable for as long as it remains in scope. You cannot re-define a PRIVATE variable as LOCAL (you get an "Illegal Re-definition" error) so that is no issue either. Paste the following code into a PRG and run it from the command line. You will see the results in Figure 1:

CLEAR
*** Re-define a Public Variable as Private and assign a value
PUBLIC gcPath
gcPath = "G:\VFP90"
? 'Public: ' + gcPath
PRIVATE gcPath
gcPath = "P:\VFP90"
? 'Private: ' + gcPath
LIST MEMORY LIKE g*

*** Re-define a Public Variable as Local and assign a value
CLEAR MEMORY
PUBLIC gcPath
gcPath = "G:\VFP90"
? 'Public: ' + gcPath
LOCAL gcPath
gcPath = "L:\VFP90"
? 'Local: ' + gcPath
LIST MEMORY LIKE g*

*** Re-define a Private Variable as Local and assign a value
CLEAR MEMORY
PRIVATE gcPath
gcPath = "G:\VFP90"
? 'Private: ' + gcPath
*** The next line raises the illegal re-definition error described above
LOCAL gcPath
gcPath = "L:\VFP90"
? 'Local: ' + gcPath
LIST MEMORY LIKE g*

 

Figure 1: VFP Manages variable scope!

  • Public variables are part of the VFP language and so they should be used

That is a completely specious argument and has no relevance whatsoever! After all, @...SAY and @...GET are also "part of the VFP language", and so are the ACCEPT, SAVE SCREEN, RESTORE SCREEN and CREATE COLOR SET commands. Do you use them too?

Basically the concept of a Public Variable is a hang-over from the earliest versions of FoxBase, which was designed to run in the DOS environment where there was only ever a single screen and you navigated through the application by opening and closing screens.  Global variables were needed in that environment to persist values between screens that were totally independent and disconnected.

However, this is not the environment in which we work today. There is absolutely no reason to use inappropriate methodologies just because they exist. After all, just because you CAN do something, doesn't mean you SHOULD do it.

  • Their main advantage is that they are ALWAYS available no matter what objects are released; intentionally or otherwise. Especially when a system crashes. This makes them good for tracking user paths through software and reporting under an On Error routine

Again, this is completely irrelevant. All variables that are in scope are always available in an error handling routine. Whether they are public or not makes absolutely no difference. The only issue is whether your error handling is properly designed and constructed or not.

But enough of this 'defense of the indefensible'; let's get down to the nitty gritty of why Public Variables are a bad idea in an object-oriented, form-based, event-driven application environment. There are essentially three issues:

  • Public Variables break encapsulation      A fundamental principle of good object oriented programming is that an object should not depend on external entities for any of its functions. That means that if a value is required that is not a part of the object's definition, it should be explicitly passed to the object when that object is invoked
  • Public Variables can be overwritten       By definition a Public Variable is not only accessible at all levels of the application, but can also be changed at all levels of the application. In my little example code above, omitting the declaration line in either the PRIVATE or the LOCAL case would have resulted in the value of the Public Variable being changed. Since the variable scope is global, that change is also effective globally
  • Public Variables have a single value      One of the commonest reasons for using Public Variables is for defining 'global' data. Of course this is fine as long as there is only ever a need for one value of each global data item at a time. As soon as you need two instances of an object, but with different values for the Public Variable in each, you are in trouble

Let's look at each of these in more detail.

The OOP Aspect

The crucial issue that must be addressed when considering the design of any class is "What are the responsibilities for objects based on this class?" Once the responsibilities are identified, then it is easy to figure out what data the object needs to manage and what it needs access to, but does not control directly. Anything in the latter category must be passed into the object when it is invoked and either acted upon immediately, or saved to an internal property for future use.

Either way, the one thing that you should never see in any object code is something like this:

This.Value = lnSomeValue * gnMyVar

Obviously "lnSomeValue" is a local variable (and that is fine because it is an internal reference) but "gnMyVar" is presumably an external "global" value. Since it is being used in a calculation we assume it is numeric but there is no reason why it should be! The following code  is perfectly valid in VFP:

PUBLIC gnMyVar
gnMyVar = "Fred"

Now what will be the result of the code that ran the calculation? A data type mismatch error!

Obviously it is very important that this value be numeric! So how would assigning it to a property help? Well, one possibility is that a property (unlike a Public Variable) can be made strongly typed by using an Assign Method. So if, instead of accessing the variable directly the object code had been:

This.Value = lnSomeValue * This.nValueToUse

we could then create an assign method on the property like this:

LPARAMETERS tuNewValue
*** Must be Numeric
IF VARTYPE( NVL( tuNewValue, '' )) = "N"
  *** OK, assign it
  This.nValueToUse = tuNewValue
ELSE
  MESSAGEBOX( 'NewValue must be numeric', 16, 'Data Error')
ENDIF

Now we need worry about it no more. Whenever anything tries to assign a value to the property it will be checked and, if not numeric, will not be saved. Now we may still get an error message if the source for this value is incorrect but the crucial difference is that the object now controls its data, and is not relying directly on the randomness of an external value.
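
To show how the pieces fit together, here is a minimal, self-contained sketch. The class name "Calculator" is a hypothetical one chosen for this example, and the assign method rejects bad values silently (rather than calling MESSAGEBOX()) so that the snippet can run unattended. VFP calls a method named after the property with an "_Assign" suffix whenever something tries to change that property.

```foxpro
*** Sketch only: "Calculator" is a hypothetical class name
LOCAL loCalc
loCalc = CREATEOBJECT( 'Calculator' )
loCalc.nValueToUse = 5
loCalc.nValueToUse = "Fred"     && rejected by the assign method
? loCalc.nValueToUse            && still 5

DEFINE CLASS Calculator AS Custom
  nValueToUse = 0

  PROCEDURE nValueToUse_Assign( tuNewValue )
    *** Must be numeric; NVL() turns a NULL into a character value
    IF VARTYPE( NVL( tuNewValue, '' )) = "N"
      This.nValueToUse = tuNewValue
    ENDIF
  ENDPROC
ENDDEFINE
```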

This is what is meant by encapsulation and it is not "merely a theoretical" issue. It is actually the basis for creating robust, bug-free, reliable applications. When applied properly there is no longer any possibility of 'unexpected' errors, however they may be caused, because each object controls its own destiny.

(Note: From Wikipedia:

Object-oriented programming has roots that can be traced to the 1960s. As hardware and software became increasingly complex, quality was often compromised. Researchers studied ways to maintain software quality and developed object-oriented programming in part to address common problems by strongly emphasizing discrete, reusable units of programming logic. The methodology focuses on data rather than processes, with programs composed of self-sufficient modules (objects) each containing all the information needed to manipulate its own data structure

……

An object-oriented program may thus be viewed as a collection of cooperating objects, as opposed to the conventional model, in which a program is seen as a list of tasks (subroutines) to perform. In OOP, each object is capable of receiving messages, processing data, and sending messages to other objects and can be viewed as an independent 'machine' with a distinct role or responsibility. The actions (or "operators") on these objects are closely associated with the object. )

Public Variables can be Overwritten

As noted in the introduction, Public Variables are totally exposed. There is no control over them whatsoever and any value can be assigned to a Public Variable at any time. Any change affects any object that accesses the Variable irrespective of how, when or where. The following is an absolutely true story (only the names have been changed to protect the innocent):

Many years ago I was involved in debugging a Vehicle Fleet Management application for a major company in Europe. The application had been running for several years without error in production, but had suddenly started crashing occasionally for no apparent reason. There was no pattern – it would run for days, even weeks without error but then for no obvious reason would crash with a "Data Type Mismatch" error. The crash always happened in a form, but not always the same form. Oddly enough the one form which never seemed to crash was the "Select Vehicle" screen and that, after many days' effort, provided the clue.

The problem was the fact that the application defined a public variable named "curveh" (i.e. the "current vehicle") that was used to store the ID (an integer) of the currently selected vehicle. Unfortunately, a newly hired developer had added a new report to the application that defined a report variable named "curveh" which was used to store the license plate number (i.e. a character string!!!) of the vehicle currently being processed by the report (it was printed out in the group footer I seem to recall, but I wouldn't swear to it).

Of course, this overwrote the system level Public Variable's integer value with a character string and so, after running this report anything which accessed, without first setting, the "curveh" variable (which was almost everything in the system apart from the "Select Vehicle" screen) would immediately crash the app with a "Data Type Mismatch" error.

As long as you didn't run the report, the app was fine and everything worked. Run the report and the app crashed immediately! But detecting the correlation between the report running and the app crashing was tough to do since it was an "occasional" report to begin with.

This is the major issue with Public Variables, and it is one that is avoided totally by using an application object that has properties (with Assign methods) for all system critical values. Of course, you can never prevent a "valid but wrong" scenario (i.e. the Vehicle ID is an integer but is not the correct one for the vehicle you wanted) but at least you can prevent the data type of the property from becoming invalid.

The introduction of support for NULL, along with the LEFT OUTER JOIN syntax, to VFP added another nail to the coffin of Public Variables. In earlier versions of FoxPro, there was no NULL support and so there was never any issue with data that evaluated to null. This is no longer the case and even if you don't explicitly set a value to an invalid type, you can still get errors, or worse, no error and bad results,  when global variables are used to store values derived from SQL queries that involve outer joins. In this case the issue is even more insidious because, in VFP, once a variable has been assigned a value it retains that type even if the actual value is NULL. Thus:

x= 'Fred'
? TYPE("x")  && = "C"
x = NULL
? TYPE("x")  && = "C" !!!

So with your variable, even testing its TYPE will leave you with potential errors. Notice that in the Assign method above I used NVL() to check the value! Again, the use of a property instead of a public variable avoids this issue too.
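
If you do have to test a variable that may hold NULL, one option is to check for the null explicitly. A short sketch using the native VARTYPE() and ISNULL() functions:

```foxpro
LOCAL x
x = 'Fred'
x = .NULL.
? TYPE( "x" )          && "C" - the original type is retained
? VARTYPE( x )         && "X" - VARTYPE() reports the null by default
? VARTYPE( x, .T. )    && "C" - second parameter asks for the original type
? ISNULL( x )          && .T.
```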

Public Variables have a single value

This one is so obvious that it needs no real explanation. If you have an application where the same form can be opened multiple times – for example where several instances of the 'tombstone' form (i.e. a supplier's or a customer's Name, Address, Phone, Fax, Email Address) may be on screen at any time – public variables are useless for holding information for the form, since the forms may well be using different tables, in different directories, will certainly have different "current" IDs and so on and so on.

However, it's not just when you need multiple instances of the same form. The issue can also arise when an application requires that more than one form be open at the same time but draw their data from different locations, or simply have different "save to" locations, or any one of a dozen different items. In other words using Public Variables for global data is only practical in single-instance "sovereign" applications (where the current form fills the screen and there is never any other form available unless it is directly called from, and relates to, the main form). Of course, this type of application is exactly how the DOS world worked, and where Public Variables were intended to be used. 

So what should we do?

The first thing to do is to ensure that all variables, irrespective of their scope, are always explicitly declared. Not only is this generally a good practice, but VFP does have, through IntelliSense, the ability to display the list of parameters and currently declared local variables (in a code, or method, editing window type 'zloc' and you will get the list from which you can insert the required name into your code). For a full discussion of Scope see my blog entry http://weblogs.foxite.com/andykramek/archive/2005/05/04/421.aspx

Second, use either your application object, or create a special "globals" object, and give it properties for your globally required information (some people like to use _Screen but, although you can add properties, you cannot add methods to _Screen and it is useful to be able to use assign methods to handle data typing and validation). Then in your application startup program create this object (using a PRIVATE declaration) so that it is available throughout the application. When you need access to a value you have one place to go for it. More importantly by using your own object  you can, as I indicated earlier, use Access and Assign methods to enforce strong typing and prevent invalid values from ever being saved.
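
A minimal sketch of this startup pattern is shown below. The names "oGlobals", "AppGlobals", "cDataPath" and "ShowPath" are all hypothetical, chosen only for illustration. In a real application each property would also get an assign method of the kind shown earlier.

```foxpro
*** Startup program sketch: all names here are hypothetical
PRIVATE oGlobals
oGlobals = CREATEOBJECT( 'AppGlobals' )
oGlobals.cDataPath = 'C:\MyApp\Data\'
*** Anything called from here on can see the PRIVATE variable
DO ShowPath

PROCEDURE ShowPath
  ? oGlobals.cDataPath
ENDPROC

DEFINE CLASS AppGlobals AS Custom
  *** One property per globally required value
  cDataPath = ''
ENDDEFINE
```

Because the variable is PRIVATE to the startup program, it goes out of scope when that program ends, so nothing is left behind in the environment.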

Third, whenever you find yourself thinking that you need a public variable, ask yourself the question: Is this the ONLY way to solve my problem? The answer will almost certainly be "no"! I know of only one scenario where I routinely use a Public Variable. This is when I want to make a program to test a class definition that is defined in a PRG so that I can simply "do" the program from the command window. Of course, all variables created in the command window are Public anyway, so in this case it really doesn't matter. Here is what I mean:

RELEASE oTest
PUBLIC oTest
*** Instantiate the class to test
oTest = CREATEOBJECT( 'myclass' )

DEFINE CLASS myclass AS Custom
  *** Class definition here
ENDDEFINE

As you can see, this simply creates, and leaves behind, an instance of the class that is defined. More importantly, that is the whole point of a Public Variable! It has persistence in the environment beyond the scope of the procedure or method in which it was created. When this is your requirement, then a public variable really is the only answer. However, how often is this really the requirement? In a runtime application environment I cannot even begin to conceive of a case where this would be necessary.

Finally, the attached zip file contains the code for my variable manager class. This is a little class, defined on the Session base class (so it runs in its own datasession) that uses a metadata table to define a set of variable names and values. It exposes three methods:

  • ListVars() that simply returns a string with the list of all variables and their current values
  • SetVar( varname, varvalue ) that sets a value on a defined property (and creates it if it doesn't already exist)
  • GetVar( varname ) that retrieves the current value of the specified variable

What this means, of course, is that I can now standardize my "global" variables by adding records to a metadata table instead of explicitly defining them in code (how often have you used different names for the same variable in different scenarios? I know I have, but not any more.) It also means that I have a standard methodology to define, retrieve and update variables that need to be available globally. So in my code, instead of:

This.cPath = gcDataPath

I can use:

This.cPath = oVarMgr.GetVar( 'cdatapath' )

The extra few keystrokes are worth it to me because I know that any critical value that I get from my variable manager will at least be valid because the only way a value can get there is through an explicit call to the manager object – and that is not something that will ever happen by accident – either in code, or in a report, or anywhere else.
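
To give a feel for the interface, here is a much simplified, in-memory sketch. It is NOT the code from the attached zip file: it swaps the Session class and metadata table for a native Collection object (VFP 8.0 or later), and the class name "xVarMgrSketch" is a hypothetical one. Only the three method names are taken from the description above.

```foxpro
*** Sketch only: in-memory stand-in, not the attached class
LOCAL loVarMgr
loVarMgr = CREATEOBJECT( 'xVarMgrSketch' )
loVarMgr.SetVar( 'cDataPath', 'C:\MyApp\Data\' )
? loVarMgr.GetVar( 'cdatapath' )
? loVarMgr.ListVars()

DEFINE CLASS xVarMgrSketch AS Custom
  PROTECTED oVars
  oVars = .NULL.

  PROCEDURE Init
    This.oVars = CREATEOBJECT( 'Collection' )
  ENDPROC

  PROCEDURE SetVar( tcName, tuValue )
    LOCAL lcKey
    *** Keys are stored in lower case so lookups ignore case
    lcKey = LOWER( ALLTRIM( tcName ))
    IF This.oVars.GetKey( lcKey ) > 0
      This.oVars.Remove( lcKey )
    ENDIF
    This.oVars.Add( tuValue, lcKey )
  ENDPROC

  PROCEDURE GetVar( tcName )
    LOCAL lcKey, luValue
    lcKey = LOWER( ALLTRIM( tcName ))
    *** Return NULL for a name that has never been set
    luValue = .NULL.
    IF This.oVars.GetKey( lcKey ) > 0
      luValue = This.oVars.Item( lcKey )
    ENDIF
    RETURN luValue
  ENDPROC

  PROCEDURE ListVars
    LOCAL lcList, lnCnt
    lcList = ''
    FOR lnCnt = 1 TO This.oVars.Count
      lcList = lcList + This.oVars.GetKey( lnCnt ) + ' = ' + ;
               TRANSFORM( This.oVars.Item( lnCnt )) + CHR(13) + CHR(10)
    ENDFOR
    RETURN lcList
  ENDPROC
ENDDEFINE
```

The collection is PROTECTED, so the only way to change a value is through SetVar(), which is the property that makes the manager safer than a bare public variable.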

As always, please feel free to modify the code to suit yourself and if you come up with an improvement, please share it.

posted by andykr | 2 Comments
Attachment(s): variableclass.zip

The Great US Health Care Debate

Regular readers of my blog know that I don't comment on political or social issues (in fact I have only done so once in the past 5 years and that was on another subject about which I also feel strongly).

However, the current debate over Government-mandated Health Care in the USA and the proposals currently before Congress, is another subject on which I feel strongly. So, in case some of you might be interested to read the text of a letter that I sent to my Congresswoman, Representative Betty Sutton, and to my US Senator, Sherrod Brown I post the contents here. I will post any replies that I receive from my Congressional Representatives.

--------------------------------------------------------------------------------------------------------------

8/30/2009: FOLLOW-UP:

Well, here we all are almost three weeks later and I have not even received an acknowledgement from Senator Sherrod Brown. Representative Betty Sutton's office sent an automated reply acknowledging receipt of my message, but no further follow-up from her office either. 

Representative Sutton did hold a telephone "town hall" meeting (barely advertised!) at which she delivered a speech straight out of the Party playbook and ducked all meaningful questions.

Apparently Sen Brown will be holding one (and only one) meeting on 1st September which, according to his website, will be
"a public forum on health insurance reform at the University of Cincinnati on Tuesday, Sep. 1, 2009. During the forum, entitled "Health Insurance Reform - What's In It for You?", Brown will outline how health insurance reform will reduce private insurance premiums and out-of-pocket health care expenses, while giving all Americans insurance options during periods of unemployment"
Apart from the fact that he is clearly not seeking the views of his constituents, but merely delivering (yet another) "Party Line" speech, the choice of venue is interesting. Cincinnati (great city though it may be) is just about as far away from all the other main centers of population in Ohio as you can get and still remain in the State!

In fact, according to the last census, less than 20% of the state's population live in the "Southwestern Metropolitan area" (i.e. Cincinnati & Dayton) while more than 30% reside in the North East (Cleveland/Akron/Canton/Youngstown) and 20% in the Columbus region. I have to admit that picking a location that is, at best, problematic for more than 80% of your constituents to get to should reduce the chance of hearing dissenting voices considerably, Senator.

Looks like my basic question has been answered: these people obviously have NO intention of doing anything other than representing their Party. The concept of representatives basing their actions on how best to represent their constituents has, apparently, ceased to play a role in the body politic - at least for these two!

--------------------------------------------------------------------------------------------------------------

NOTE: This is all that I have to say on the topic (I am not looking to provoke discussion - this is merely what I sent to MY Representatives) and I will not accept comments on this blog article.

As a resident of NE Ohio,  and hence one of your constituents, I am writing to enquire what your intentions are in respect of the current Administration's headlong rush to pass "health care reform".  Specifically I am interested to know:

  • Whether you, personally, have actually read the proposed bill(s) and understand them
  • Whether you intend to actually consult any of your constituents, or even seek the opinion of the residents, of the state that you represent – if so, when and how?
  • When you intend making a decision on how you will vote, and on what you will base that decision

As an immigrant to the United States some ten years ago and former UK resident I have personal experience of growing up and living under a single-payer, Government-run, health care system. Furthermore my brother-in-law is actually a General Practitioner (i.e. Primary Care Physician) in the UK's National Health Service so not only do I have 40+ years worth of personal experience as a user, I also have some insight from the provider's perspective.

What is concerning me now is that there appears to be a strong sense that, somehow, the United States Government will avoid the errors and pitfalls that I know beset the UK system (and believe similarly affect the Canadian, French and German systems to name but three others). These include rampant bureaucracy, rapidly increasing costs, long waiting lists for even seeing a doctor and rationed care that is based not on medical judgment but on some bureaucratic 'efficiency formula' which is not even open to public scrutiny or question.

The evidence that the US Government will succeed where others have failed is, to say the least, sketchy and given the Government's all-round record of running large-scale operations, not terribly encouraging. As a small business owner, approaching 60 years of age, I am personally very afraid of what the proposals now apparently being considered will actually mean in practice.

For example, Section 141(a) sets up "an independent agency in the executive branch of the Government, a Health Choices Administration". What exactly does that mean? Especially when considered in the light of Section 141(b)(1) which states that "The Administration shall be headed by a Health Choices Commissioner (in this division referred to as the ‘‘Commissioner’’) who shall be appointed by the President, by and with the advice and consent of the Senate"

Doesn't this mean that the President is effectively going to be running the Health Care commission without any real oversight or control by the House/Senate?

How about Section 141 (c) ?

"The Commissioner shall collect data for purposes of carrying out the Commissioner’s duties, including for purposes of promoting quality and value, protecting consumers, and addressing disparities in health and health care and may share such data with the Secretary of Health and Human Services"

What "data" is this referring to? With whom else will this data be shared? I assume that the reference is not to the Secretary of Health PERSONALLY - but to the whole department!

I am also curious about Section 1301 (Accountable Care Organization Pilot Program), which begins by stating that: "The Secretary shall conduct a pilot program (in this section referred to as the ‘pilot program’) to test different payment incentive models"

This is an eminently sensible and laudable idea, but the section goes on to say (on Page 454 under Sub Section (4)) that:

"There shall be no administrative or judicial review under section 1869, section 1878, or otherwise of—

‘‘(A) the elements, parameters, scope, and duration of the pilot program;

‘‘(B) the selection of qualifying ACOs for the pilot program;

‘‘(C) the establishment of targets, measurement of performance, determinations with respect to whether savings have been achieved and the amount of savings;

‘‘(D) determinations regarding whether, to whom, and in what amounts incentive payments are paid"

Doesn't this mean that there are no requirements for the pilot program to achieve anything at all and no requirement to account for how it spends money paid for "incentives" – whatever that may mean!

Finally I am very interested to hear your interpretation of Section 9511 (HEALTH CARE COMPARATIVE EFFECTIVENESS RESEARCH TRUST FUND) which, under section (B) on Page 824 states that:

"There are hereby appropriated to the Trust Fund the following:

‘‘(1) For fiscal year 2010, $90,000,000.

‘‘(2) For fiscal year 2011, $100,000,000.

‘‘(3) For fiscal year 2012, $110,000,000.

‘‘(4) For each fiscal year beginning with fiscal year 2013—

‘‘(B) subject to subsection (c)(2), amounts determined by the Secretary of Health and Human Services to be equivalent to the fair share per capita amount computed under sub section (c)(1) for the fiscal year multiplied by the average number of individuals entitled to benefits under part A, or enrolled under part B, of title XVIII of the Social Security Act during such fiscal year.

The amounts appropriated under paragraphs (1), (2), (3), and (4)(B) shall be transferred from the Federal Hospital Insurance Trust Fund and from the Federal Supplementary Medical Insurance Trust Fund (established under section 1841 of such Act), and from the Medicare Prescription Drug Account within such Trust Fund, in proportion (as estimated by the Secretary) to the total expenditures during such fiscal year that are made under title XVIII of such Act from the respective trust fund or account.

Sounds to me a lot like "robbing Peter to pay Paul"!

I could go on with reading this nightmarishly incomprehensible document, but then I realized that I have representatives in Congress whose function is to do that on my behalf and to make sensible decisions, based on the views of the majority of their constituents and with the good of their constituents as a whole at heart.

I mean that is the democratic process isn't it?

That IS why you were elected to your high and prestigious office, right?

You do still report to us, your constituents, and rely on our support to keep you in office after the next election, don't you?

So I return to my original questions: Have YOU read the Bill? Do you intend to consult your constituents (if so, when and how)? Are you making your voting decision on anything other than a desire to render a partisan vote along Party Lines?

 

Yours faithfully

Andrew E Kramek

posted by andykr | (Comments Off)
Filed Under:

Using a Memento Pattern to Implement CTRL+Z

The native behavior of Visual FoxPro controls is that as long as a control has focus, pressing CTRL + Z (or the escape key) undoes any change that has been made since it gained focus. However, once the user moves off the control, the ability to undo changes in that control is lost. All that can be done at that point is to revert all pending changes to all fields in the selected record. In other words it is an all or nothing ‘undo’.

However, if you are working in most other Microsoft Windows Applications then pressing CTRL+Z undoes changes sequentially, in reverse. In other words, the first press of CTRL+Z undoes the last change made, the second undoes the last but one and so on. We can easily mimic this behavior in a VFP Form by implementing a "memento" pattern.

What is a Memento pattern?

The Memento pattern addresses the issue of preserving object state. This should not be confused with storing data (e.g. User preferences) but relates specifically to an object whose settings may be changed during its lifetime, but where we need to be able to restore a previous set of values at any time. Data that must be preserved beyond an object’s lifetime must be written out to some form of persistent storage (e.g. to a database, an XML data file or an INI file). The key point is that a memento does not persist between instances of its originator.

What are the components of the Memento

The memento pattern comprises three components (Figure 1):

  • Originator: This is the object whose state is to be captured. It is responsible for creating the memento, passing a snapshot of itself to the memento object and restoring its state from the snapshot when necessary.
  • Memento: Responsible for storing the internal state of the Originator. The amount of information being stored depends entirely on the originator but the memento must ensure that the information is protected from access by objects other than the originator.
  • Caretaker: Responsible for instantiating and holding the memento object. The caretaker has no knowledge of, or stake in, the information being stored by the memento. In addition to creating the memento, it must track which memento belongs to which originator.

How do we set about building one

In order to deliver undo functionality to a form, we have to ensure that a memento is created each time a user makes a change in a control. However, the great danger with mementos is that they consume system resources. So in order to keep things manageable we only really want to save changes that have been made when the user leaves each control.

So the first thing we have to do is to ensure that we can compare the exit value from an editable control with the original value that it contained when it received focus. That allows us to create the memento only when the exit value differs from the entry value, in other words, when a change has actually been made. To help with this I added a property named “uOldVal” and the following code to the GotFocus() of each base class that has a value property:

This.uOldVal = This.Value

The next step is to determine whether we need to create a memento or not. One possibility would be to add a flag to each control that defines whether we want to record changes. Then add code to the LostFocus() to compare the original and current values when this flag is set and to initiate the creation of a memento whenever they differ. However, this would make each control the “originator” for the memento and that would make restoring the data difficult since the pattern requires that the caretaker only allow the originator to access its memento. The consequence of this approach would be that in order to undo a change the user would first have to navigate to the control – not an unreasonable situation but not quite what we want if we are trying to implement a CTRL + Z style “undo the last operation”.

So really we want our Form to play the role of “Originator” and, since the introduction of BindEvent() in Visual FoxPro, we have a simple way of handling this without the necessity of making even more changes to the individual classes. Instead we add a property to our root form class (GenForms::xFrmStd) to determine whether the form should support multiple undo levels. This could either be a simple ‘True/False’ flag or, as I prefer, a numeric property named “nUndoSteps” that limits the number of Undo steps that the form allows.

In the form’s setup method we check to see whether nUndoSteps is greater than 0. If so, we use BindEvent() to link the LostFocus() event of each control with a uOldVal property to the form’s custom “SetMemento()” method. The code is very simple, completely generic and so it can go directly in the form's root class:

*** Check to see if we need to handle mementos
IF ThisForm.nUndoSteps > 0
  *** Yes, we do, so instantiate the caretaker object here
  ThisForm.oCareTaker = NEWOBJECT( "xcaretaker", "basectrl.vcx" )
  *** And then we need to register the controls on the form
  ThisForm.RegisterControls( This )
  *** Set the KeyPreview property
  ThisForm.KeyPreview = .T.
  *** And disable menu handling by re-directing CTRL+Z
  ON KEY LABEL CTRL+Z KEYBOARD "{CTRL+F5}"
ENDIF

This code calls the custom RegisterControls() method, which is recursive and handles the actual task of binding the controls:

LPARAMETERS toObject
LOCAL loObject
*** If we have a container, drill down
IF INLIST( LOWER( ALLTRIM( toObject.BaseClass ) ), ;
  [form], [pageframe], [page], [container], [grid], [column] )
  FOR EACH loObject IN toObject.Objects FOXOBJECT
    Thisform.RegisterControls( loObject )
  ENDFOR
ELSE
  *** Use BindEvent to setup the Form's SetMemento()
  *** method as the delegate for the Control's LostFocus()
  IF PEMSTATUS( toObject, [uOldVal], 5 ) AND ;
     PEMSTATUS( toObject, [LostFocus], 5 )
     BINDEVENT( toObject, [LostFocus], Thisform, [SetMemento], 1 )
  ENDIF
ENDIF

Note the use of ON KEY LABEL in the Setup code above. This is needed because CTRL + Z is a system shortcut combination and is normally processed by the menu before it reaches the form. So in order to have our form able to intercept the CTRL+Z key combination, we need to re-direct it to a non-system combination (in this case I am using "CTRL+F5") that we can detect with the following code in the KeyPress() event, which calls the form’s custom GetMemento method to retrieve the last memento saved. Notice also that, having trapped the original CTRL+Z keystroke we need to kill it to prevent the system from ever seeing it – hence the NODEFAULT in the keypress handler code:

LPARAMETERS nKeyCode, nShiftAltCtrl
*** Use Ctrl+F5 (key code 98, re-directed from CTRL+Z) to handle mementos
IF nShiftAltCtrl = 2 AND nKeyCode = 98 ;
   AND VARTYPE( ThisForm.oCareTaker ) = "O" AND ThisForm.oCareTaker.Count > 0
   *** We are using mementos
   ThisForm.GetMemento()
   *** And eat the keystroke
   NODEFAULT   
ENDIF

So much for the Originator. Next we need to address the CareTaker which, in my example is an instance of a collection class instantiated by the form, and assigned to a form property. The Caretaker could, of course, exist at any level – providing that it is accessible to the Originator. However, since the functionality in this case is form specific, there is really no need for the caretaker to exist outside of the form itself and, by making the property to which it is assigned "protected" we can ensure compliance with the requirement that only the originator can access the mementos. An additional benefit of this approach is, of course, that when the form is released, any mementos that are associated with it are also destroyed.

One other thing that we do have to handle is limiting the number of undo steps. This is done in the root form class SetMemento() method, where the current count is checked and, if the maximum allowed number of levels has been reached, the first (oldest) item is discarded before the new one is added. The code is, yet again, quite straightforward:

LOCAL ARRAY laControls[ 1 ]
LOCAL loControl, lnMaxMems, loMemento, lcKey, lcSource
WITH ThisForm
  *** Get a Reference to the control that delegated its LostFocus to this method
  IF AEVENTS( laControls, 0 ) > 0
    *** Get an object reference to the control
    loControl = laControls[1]
    IF NOT ( ALLTRIM( TRANSFORM( loControl.Value )) == ALLTRIM( TRANSFORM( loControl.uOldVal)))
      *** The control has changed
      lnMaxMems = .nUndoSteps
      IF .oCareTaker.Count = lnMaxMems
        *** Remove the first item in the collection
        .oCareTaker.Remove( 1 )
      ENDIF
      *** Now create the memento
      loMemento = NEWOBJECT( 'empty' )
      lcSource = STRTRAN( SYS(1272, loControl), ThisForm.name, 'ThisForm' )
      ADDPROPERTY( loMemento, 'oSource', lcSource  )
      ADDPROPERTY( loMemento, 'uOldVal', loControl.uOldVal )
      *** And hand it to the caretaker
      .oCareTaker.Add( loMemento )
    ENDIF
  ENDIF
ENDWITH
RETURN

We get a reference to the control that fired the method call using AEVENTS(), and then check the control to see if a change was made. If so, we check the memento count, and if necessary remove the oldest (first) item in the collection. All that is left is to create the memento which, in this example consists of the object hierarchy (as returned by SYS(1272) with the form name replaced with “ThisForm”) and the value that the control held prior to the change being made. Labeling the item in this way simplifies the task of restoring the value in the GetMemento() method, which is the last piece of the code we need.

The GetMemento() method retrieves the last memento from the Caretaker and uses its content to restore the change.

LOCAL lnLastItem, loMemento, loControl
*** Only proceed if we have a caretaker that is holding at least one memento
IF VARTYPE( ThisForm.oCareTaker ) = "O" AND ThisForm.oCareTaker.Count > 0
  lnLastItem = ThisForm.oCareTaker.Count
  *** Get the last item from the collection
  loMemento = ThisForm.oCareTaker.Item( lnLastItem )
  *** And remove it!
  ThisForm.oCareTaker.Remove( lnLastItem )
  *** Now just restore the value to the control
  loControl = EVALUATE( loMemento.oSource )
  loControl.Value = loMemento.uOldVal
ENDIF
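
The way the caretaker behaves as a size-limited, last-in-first-out stack can be tried out without a form at all. The following is only a minimal sketch, not the code from the attached zip file: it uses a plain Collection and Empty objects directly, and the limit of 3 and the "Value n" strings are assumptions chosen purely for illustration.

```foxpro
*** Stand-alone sketch of the caretaker logic (no form required)
LOCAL loCareTaker, loMemento, lnStep, lnMaxSteps
*** Assumed limit, playing the role of the form's nUndoSteps property
lnMaxSteps = 3
loCareTaker = CREATEOBJECT( "Collection" )
*** Simulate five changes being recorded
FOR lnStep = 1 TO 5
  IF loCareTaker.Count = lnMaxSteps
    *** Limit reached, so discard the oldest memento first
    loCareTaker.Remove( 1 )
  ENDIF
  loMemento = CREATEOBJECT( "Empty" )
  ADDPROPERTY( loMemento, "uOldVal", "Value " + TRANSFORM( lnStep ) )
  loCareTaker.Add( loMemento )
ENDFOR
*** Undo in reverse order: always take the last item, then remove it
DO WHILE loCareTaker.Count > 0
  loMemento = loCareTaker.Item( loCareTaker.Count )
  loCareTaker.Remove( loCareTaker.Count )
  ? loMemento.uOldVal
ENDDO
```

Only the three most recent changes survive the limit, so the loop prints "Value 5", "Value 4" and "Value 3" in that order.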

The only other issue that must be handled relates to the ‘current record’. Obviously we need to clear out any mementos that may exist upon change of record. Fortunately that is easy to do with a collection: just call its Remove() method with a value of "-1"! Moreover, since our standard implementation for editable forms is that they are always brought up in “View Only” mode, this is easily handled by explicitly clearing the collection every time the form’s mode changes to VIEW (i.e. on leaving EDIT or ADD).
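
As a rough sketch of that clean-up step, the code could live in a small form method that is called from wherever the form changes record or switches back to VIEW mode. The method name ClearMementos() is hypothetical and is used here only for illustration.

```foxpro
*** Hypothetical form method: ClearMementos()
*** Discards every saved memento, for example when the record
*** pointer moves or the form returns to VIEW mode
IF VARTYPE( ThisForm.oCareTaker ) = "O" AND ThisForm.oCareTaker.Count > 0
  *** Passing -1 to Remove() empties the collection in one call
  ThisForm.oCareTaker.Remove( -1 )
ENDIF
```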

You will notice that all of this code is contained within the root form class since, if it is not implemented specifically by setting the nUndoSteps property greater than zero in the instance of the form, it does nothing at all. The only other class required is the collection class (xCareTaker), which is simply an unmodified first level subclass of the VFP Collection base class.

The zip file attached to this post includes all the necessary classes, and a sample form (below) using a free table. Run the form, put it into edit mode and make a few changes to control values. Then press CTRL+Z and watch the changes undo themselves…enjoy!

 

posted by andykr | 1 Comments
Filed Under: ,
Attachment(s): Memento.zip

More of my Useful Utilities (at least, I think they are)

Like all developers, whatever their preferred language or environment, I have my own personal set of tools and utilities that I use to make life a little easier for myself. Although not particularly generic, or even clever, I find that these little things really help so I offer some more of them here, as is, for your enjoyment, adoption and modification.

The first is a little routine that I wrote when I was asked to address some issues that were being reported in an application written by someone else. I was given access to the source code and needed to find out where the problems were. So I ran the program from the command window and saw the first issue in a form, but what was the form name? Which file was it defined in, and since the problem was actually in a grid, where was the code?

So "TellMe.prg" was born. Its function is to grab an object reference to whatever control is under the mouse and display all the relevant information about it in a message box with an option to copy the data to the clipboard (so I can Alt+Tab over to an open Word document and paste it in for later reference!). I just assign the program to a convenient hot key and away I go (See Figure 1).

To implement it all you need (assuming that the source code is in the path) is a simple

ON KEY LABEL CTRL+T TellMe()

and here is the code:

***********************************************************************
* Program....: TELLME.PRG
* Author.....: Andy Kramek
* Date.......: 31 August 2002
* Notice.....: Copyright (c) 2002 Tightline Computers Ltd, All Rights Reserved
* Compiler...: Visual FoxPro 08.00.0000.1916
* Purpose....: Details of the control under the mouse pointer
***********************************************************************

LOCAL loObj, lcOHchy, lcClass, lcBClass, loParent, lcStr, lcSource, lnSave, lcParLib, lcParent
LOCAL lcCLib, lcFile, lcLoc
#DEFINE CRLF CHR(13) + CHR(10)
*** Get the object reference
loObj    = SYS( 1270 )
IF VARTYPE( loObj ) # "O"
  RETURN
ENDIF
*** Get the associated information from the reference
lcClass  = loObj.Class
lcCLib   = loObj.ClassLibrary
*** SYS(1271) gets us the File name
lcFile   = SYS( 1271, loObj )
lcLoc = SYS(1272, loObj )
IF TYPE( "loObj.Parent" ) = "O" AND NOT ISNULL( loObj.Parent )
  loParent = loObj.Parent
  lcParent = ALLTRIM( loParent.Class ) + "::" + ALLTRIM( loParent.Name )
  lcParLib = ALLTRIM( loParent.ClassLibrary )
ELSE
  lcParent = ""
  lcParLib = ""
ENDIF
lcSource = IIF( PEMSTATUS( loObj, 'controlsource', 5 ), ALLTRIM( loObj.controlsource ), "" )
IF EMPTY( lcSource )
  lcSource = IIF( PEMSTATUS( loObj, 'recordsource', 5 ), ALLTRIM( loObj.recordsource ), "" )
ENDIF
*** Build the string
lcStr = ""
lcStr = lcStr + "Object: " + ALLTRIM( loObj.Name ) + CRLF
lcStr = lcStr + "Class: " + ALLTRIM( lcClass ) + CRLF
lcStr = lcStr + "ClassLib: " + ALLTRIM( lcCLib ) + CRLF
lcStr = lcStr + "Location: " + ALLTRIM( lcLoc ) + CRLF
lcStr = lcStr + IIF( EMPTY( lcSource ), "", "Source: " + ALLTRIM( lcSource ) + CRLF)
lcStr = lcStr + IIF( EMPTY( lcParent ), "", "Parent: " + ALLTRIM( lcParent ) + CRLF)
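*** The parent's class library needs its own line break so that
*** the SCX file name does not run on from it
lcStr = lcStr + IIF( EMPTY( lcParLib ), "", "Parent ClassLib: " + ALLTRIM( lcParLib ) + CRLF)
lcStr = lcStr + IIF( EMPTY( lcFile ), "", "SCX File: " + ALLTRIM( lcFile ))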
*** Display it
lnSave = MESSAGEBOX( lcStr, 36, "Save Details to ClipBoard?" )
IF lnSave = 6
  _Cliptext = STRTRAN( lcStr, CRLF, "~", -1, -1, 1)
ENDIF

 

Another very simple little tool that I use quite often is one to find a specific occurrence of a string in a file. Now I know we have the excellent code references tool built into the newer versions of VFP, but, apart from the fact that this tool pre-dates those, it is rather different in intent. Most of my VFP work has been concerned with non-visual objects (middle tier components and the data access layer) and I use programmatically defined classes extensively. However, it's tough to manage the prg files. Though the Document View was a huge benefit, it is totally interactive and doesn't always do what I want. So I wrote FindText to allow me to search through a prg, find all occurrences of a specific text string and write them out to file with, optionally, the relevant line number (ready for the GoTo function).

So why do I need this? Well, as those of you who have ever seen my code will know I am a strong believer in commenting the code (just look at TellMe.prg above) and a very quick way to document some code is to use FindText because I always prefix my comments with a triple "*". So the following line of code:

findtext( '***', 'findtext.prg', 'findtext.txt', .F., .T. )

creates a file named FindText.txt that contains:

Searching For: '***' In File: findtext.prg
==========================================
**********************************************************************
**********************************************************************
*** Check Parameters - Search String MUST be passed
*** If no file, get one!
*** But check that the file exists (in case)
*** Get the Type of Input File (based on Extension)
*** If we have no Target file, just send output to standard text file named "FindText.Txt"
*** Get search string into local variable too
*** Create the Output file and add a header to it:
*** We have a form or class library
*** Open the file as a table
*** Get all non-deleted method code into a single string
*** Close file and restore work area
*** Just grab the entire file
*** Get data into an Array, one row per line and trim them (third param!)
*** nothing found
*** We have something to search, so go for it
*** Check each line for the search string
*** Dump it to the output file

and here is the code for this one:

 

**********************************************************************
* Program....: FindText
* Compiler...: Visual FoxPro 06.00.8492.00 for Windows
* Copyright..: Andy Kramek and Marcia Akins, Tightline Computers Inc, 2001
* Abstract...: Locate all Occurrences of a specified text string and dump to file
**********************************************************************
LPARAMETERS tcFindString, tcInFile, tcOutFile, tlShowLines, tlSuppressDisp
LOCAL ARRAY laLines[1]
LOCAL lcInFile, lcExt, lcOutFile, lcStr, lcFindStr, lnSelect, lnLines, lnCnt
*** Full carriage return + line feed pair, as the constant's name implies
#DEFINE _CRLF    CHR(13) + CHR(10)

*** Check Parameters - Search String MUST be passed
IF VARTYPE( tcFindString ) # "C" OR EMPTY( tcFindString )
    ASSERT .F. MESSAGE "Must pass a Character String to FindText()"
    RETURN .F.
ENDIF
*** If no file, get one!
IF VARTYPE( tcInFile ) # "C" OR EMPTY( tcInFile )
    lcInFile = GETFILE( 'PRG;SCX;TXT;VCX', "File", "Open", 0, "Choose File to Search" )
    IF EMPTY( lcInFile)
        ASSERT .F. MESSAGE "Must specify a file to search for FindText()"
        RETURN .F.
    ENDIF
ELSE
    lcInFile = ALLTRIM( tcInFile )
ENDIF
*** But check that the file exists (in case)
IF ! FILE( lcInFile )
    ASSERT .F. MESSAGE "Must pass an available file to search to FindText()"
    RETURN .F.
ENDIF

*** Get the Type of Input File (based on Extension)
*** (forced to upper case so that "scx" and "SCX" are treated alike below)
lcExt = UPPER( JUSTEXT( lcInFile ))
*** If we have no Target file, just send output to standard text file named "FindText.Txt"
lcOutFile = IIF( VARTYPE( tcOutFile ) # "C" OR EMPTY( tcOutFile ),  "FindText.Txt", ALLTRIM( tcOutFile ))
*** Get search string into local variable too
lcFindStr = ALLTRIM( tcFindString )

*** Create the Output file and add a header to it:
lcStr =  "Searching For: '" + lcFindStr + "' In File: " + lcInFile
STRTOFILE( lcStr + _CRLF , lcOutFile )
STRTOFILE( REPLICATE( "=", LEN( lcStr) ) + _CRLF, lcOutFile, .T. )
STRTOFILE( _CRLF, lcOutFile, .T.)

IF INLIST( lcExt, "SCX", "VCX")
    *** We have a form or class library
    lnSelect = SELECT()
    *** Open the file as a table
    USE ( lcInfile ) IN 0 ALIAS FileToSch
    *** Get all non-deleted method code into a single string
    SELECT FileToSch
    lcStr = ""
    SCAN FOR ! DELETED()
        lcStr = lcStr + ALLTRIM( METHODS )
    ENDSCAN
    *** Close file and restore work area
    USE IN FileToSch
    SELECT (lnSelect)
ELSE
    *** Just grab the entire file
    lcStr = FILETOSTR( lcInFile )
ENDIF

*** Get data into an Array, one row per line and trim them (third param!)
lnLines = 0
lnLines = ALINES( laLines, lcStr, .T.)
IF lnLines < 1
    *** nothing found
    RETURN .F.
ENDIF
*** We have something to search, so go for it
FOR lnCnt = 1 TO lnLines
    *** Check each line for the search string
    IF lcFindStr $ laLines[lnCnt]
        *** Dump it to the output file
        IF tlShowLines
          STRTOFILE( "Line " + TRANSFORM(lnCnt) + ";  " + laLines[lnCnt] + _CRLF, lcOutFile, .T.)
        ELSE
          STRTOFILE( laLines[lnCnt] + _CRLF, lcOutFile, .T.)
        ENDIF
    ENDIF
NEXT

IF ! tlSuppressDisp
  MODI FILE (lcOutFile) NOWAIT
ENDIF
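
As a further illustration, the fourth parameter can be switched on so that each match is prefixed with its line number. The call below is only a sketch: myclasses.prg and procs.txt are hypothetical file names, and the program file is assumed to be somewhere in the path.

```foxpro
*** Hypothetical example: list every line containing PROCEDURE in
*** myclasses.prg, with line numbers, and open the result for viewing
findtext( 'PROCEDURE', 'myclasses.prg', 'procs.txt', .T., .F. )
```

Because the comparison is done with the $ operator, the search is case-sensitive, so 'PROCEDURE' will not match lines that use 'Procedure' or 'procedure'.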

Finally, today, another 'quick and dirty' little dump program. This time for listing out the contents of a VFP database container, either just the table and column names, or the names and the actual column definitions. As with most of my little utilities, this dumps the information to a text file in the current working directory.

********************************************************************
*** Name.....: GETFOXDBC.PRG
*** Author...: Andy Kramek
*** Date.....: 1/1/2005
*** Notice...: Copyright (c) 2005 Tightline Computers, Inc
*** Compiler.: Visual FoxPro 08.00.0000.3117 for Windows
*** Function.: Get a listing of all tables in a DBC and write the fields out to a text file
********************************************************************
LPARAMETERS tcDBC, tlNoStru
LOCAL lcTable, lnCnt, lcOutFile, lcStr, lnFields, lcDbc
LOCAL ARRAY laFields[1]
CLOSE ALL
*** Define some constants here
#DEFINE CRLF  CHR(13) + CHR(10)
#DEFINE _LINE CRLF + REPLICATE( "=", 60 ) + CRLF
SET ASSERTS ON

*** Create the Output File
lcOutFile = ALLTRIM( tcDBC ) + ".txt"
lcStr = "Table Definitions for " + UPPER(ALLTRIM( tcDBC )) + " database" + _LINE + CRLF
STRTOFILE( lcStr, lcOutFile )

*** Get the tables
lcDbc = FORCEEXT( tcDBC, 'dbc' )
SELECT objectname AS table_name ;
  FROM (lcDBC) ;
WHERE objecttype = 'Table' INTO CURSOR curTables

*** Close the DBC (opened as a table by the query) using its alias
USE IN ( JUSTSTEM( lcDbc ))
SELECT curTables
GO TOP
SCAN
  *** Get the table name and open it
  lcTable = ALLTRIM( table_name )
  lcStr = CRLF + PROPER( lcTable ) + CRLF + CRLF
  STRTOFILE( lcStr, lcOutFile, 1 )
  USE (lcTable) IN 0
  SELECT (lcTable)

  *** Get the field Data
  lnFields = AFIELDS( laFields )
  lcStr = ''
  FOR lnCnt = 1 TO lnFields
    *** Field Definition
    lcStr = lcStr + LOWER( laFields[lnCnt,1] )
    IF NOT tlNoStru
      lcStr = lcStr + "   " + laFields[lnCnt,2] + " (" ;
      + PADL( laFields[lnCnt,3], 3 ) ;
      + IIF( EMPTY( laFields[lnCnt,4] ), '', ',' + ALLTRIM( STR( laFields[lnCnt,4] ))) + ' )'
    ENDIF
    lcStr = lcStr + CRLF
  NEXT
 
  *** Write out field list
  STRTOFILE( lcStr, lcOutFile, 1 )

  *** Close the table and go back to the list of tables
  USE IN (lcTable)
  SELECT curTables
ENDSCAN
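
A minimal usage sketch, assuming the program has been saved as getfoxdbc.prg and that a database named mydata (a hypothetical name) and its tables can be found in the current directory or path:

```foxpro
*** Hypothetical database name; pass it without path or extension
*** Table names plus column definitions, written to mydata.txt
DO getfoxdbc WITH "mydata"
*** Table and column names only
DO getfoxdbc WITH "mydata", .T.
```

Bear in mind that the program begins with CLOSE ALL, so it is best run from a clean command window rather than in the middle of a session with open tables.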

 
As with the last set of tools I posted, I hope that these will give you some ideas for how you can make them better, so please feel free to improve the code and do let me know what improvements you make.

 

 

posted by andykr | 3 Comments
Filed Under: ,

New Software - Build or Buy? A Personal View

Recently I was asked to review a proposal for the medium term development strategy (covering the next 3-5 years) for a company that was contemplating a major extension of the functionality of their in-house business system. One of the key questions being raised was whether the company should continue its current practice of in-house software development or should attempt to replace its custom software with off-the-shelf packaged product(s). This is clearly a fundamental decision with many ramifications and implications, and is not something that can be handled in a 30 minute meeting on a wet Friday afternoon (not that they were even suggesting such a thing). While the following is only a part of the final result, it does address some of the key points at issue and may be of some interest, and maybe even of some use, to others.

Build or Buy?

This is a fundamental decision that will dramatically affect the organization in all aspects of its operation. Currently the company keeps the responsibility for the design, implementation and maintenance of its business critical software in house by employing both development and support staff. It is likely that, in order to support any major extension of the current suite of software, the requirement for IT staff, and probably for external support (i.e. Consultants/Contractors) even if only in the early stages of development, will increase. In the light of this, the possibility of purchasing packaged software rather than undertaking the development directly is potentially attractive.

However, it should be borne in mind that while it is often true that buying an off-the-shelf package reduces the need for in-house IT development (and even support) it does not remove it completely. Since the software in question is undoubtedly “business critical” the possibility of losing the system and being unable to obtain the necessary support must be taken seriously. At the very least, a full operations staff will be needed to ensure continuity of service and system availability and to manage routine operations connected with system maintenance and security. However, there are other issues that must be taken into consideration before buying a software package even assuming that suitable candidate(s) can be identified.

Functionality

The first, and unquestionably the most important, is that of functionality. This is a major problem, which affects all off-the-shelf software because there is a direct contradiction inherent in all packaged software. Simply put, in order to be attractive to as many potential customers as possible the software must encompass as wide a range of functionality as possible. However, any one client has a particular set of requirements that will probably require only some fraction of the whole and so they may end up paying for functionality that they neither need nor want. Consequently, packaged software falls into three broad categories:

  • First there are the non-specific software packages that we all use daily. This group includes all development tools and applications like Word Processors and Spreadsheets. These are not, in the context of the build or buy debate, "packages".
  • Second there is software that implements a specific set of functionality that is governed by fixed rules. This is definitely the most successful and widely available packaged software. The differentiation between such packages is largely down to the design of the user interface, the degree of modularity (i.e. how much of the totally available functionality you have to buy) and the ease with which the application can be integrated with other software. For example, all accounting systems have to be functionally identical (i.e. they must have GL, AP and AR, and these must function in a way which complies with IRS statutes, accounting practices and the law) but there are nevertheless numerous variations available.
  • Finally there is software that is designed specifically to facilitate the operations of a particular industry or market sector (often referred to as ‘vertical market’ software). Such software often originates within a single company or organization and is then ‘generalized’ to a greater or lesser extent and offered to other companies in the same business.

Any packaged software that is to replace a current, customized and in-house built application would fall into the third category. The issue with implementing any such software within an existing organization is the “degree of fit.” Since, by definition, the software has not been designed in house it is absolutely certain that it will not operate in exactly the same way as the current application does.

The key question is, therefore, the degree of divergence between the company's methods and procedures and those that underpin the software. A fit of less than 90% should be grounds for immediate rejection; moreover, customization of either functionality or processing should be eschewed at all costs. The reason is simply that all software is built around a set of assumptions, and those underlying assumptions drive all functionality and processing. If the functionality and processing methodology do not obviously fit the organization’s requirements then it is a sure indication that the underlying assumptions are inappropriate. Any attempt to force the software to work in a way for which it was not designed (i.e. customize it!) is doomed to failure.

The general rule when considering off the shelf software is that the only way to successfully implement it is for the organization to adopt the exact functionality and processing that the package provides. The only customization that should be undertaken (apart from cosmetic items, like changing logos and report layouts) should be for totally new functionality.

The most successful implementations of off-the-shelf software as a replacement occur when either:

  • The organization wishes to change its operating process and procedures, but does not want to re-write its entire suite of software in order to do so. In this case the new software is the mechanism for implementing change
  • The organization wishes to extend its existing operations but has neither time, nor resources, to undertake the necessary development work. In this case the new software is the mechanism for introducing new functionality

Notice that in each scenario the packaged software must be implemented 'as-is', with no attempt to duplicate the current system's implementation.

Change Management

The second issue, only slightly less important than functionality, is that any company is a provider of services whose success in the market place depends, to a greater or lesser extent, on being able to respond quickly and efficiently to changes in requirements. When software is under the direct control of the company, changes can be implemented in whatever fashion, and at whatever pace, best suits the organization's operational needs. In other words, as long as the software and the development resources are under its direct control, the organization not only controls, but actually drives, change.

As soon as packaged software is involved, this ability is lost totally. Instead of being the driver of change, the organization is reduced to being the recipient of whatever the software vendor deems to be the most appropriate solution for their needs in a timescale that they define. The more widely a given package is used, the more difficult it is for the vendor to implement changes quickly, if only because of the necessity to assess the potential impact of any change on a variety of different clients.

While this is not necessarily a bad thing, it does represent a loss of control over a function that is often critical to the success of the organization as a whole and the realities and implications of such a loss of control must be fully investigated and accepted. While SLAs and similar contractual agreements can mitigate the risk, the fundamental question remains one of 'What-if…?' and answers should encompass scenarios ranging from the inconvenient (e.g. minor functional issues) to the catastrophic (e.g. the vendor's business collapses overnight).

Cost

The perception of the cost of packaged vertical market software is often that it is significantly cheaper than custom development. In reality, the difference is rarely that significant, although it is undoubtedly true that the apportionment of cost over time will differ significantly. Typically custom software has a high initial cost that declines with time while packaged software typically has a lower initial cost but has static, or even increasing, on-going costs.

This is simply because the development costs of the packaged software are borne by the supplier and amortized over time and across their entire client base. In addition to the initial purchase price there will also be an annual maintenance cost, typically in the range of 15-20% of initial cost, and usually with an annual increment built in. The cost comparison chart for "build" (black) versus "buy" (red) typically looks something like that shown in Figure 1:

The consequence is that the total cost of acquisition and maintenance over the typical software life span of five years is usually comparable and, in some cases, custom software may actually end up being cheaper. 

 

Figure 1: Software build (Black) vs buy (Red) costs over time
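To make the shape of those two curves a little more concrete, here is a rough sketch of the arithmetic, written in Python rather than VFP purely for convenience. Every figure in it is invented for illustration: the initial costs, the declining maintenance figures for the custom system, the 18% maintenance rate and the 5% annual increment are all assumptions (the last two merely chosen to fall inside the ranges mentioned above), not numbers from any real project.

```python
def cumulative_costs(initial, yearly_costs):
    """Return the running total after each year, starting from the initial outlay."""
    totals = []
    running = initial
    for cost in yearly_costs:
        running += cost
        totals.append(running)
    return totals


YEARS = 5

# Hypothetical "build": high initial cost, on-going cost that declines each year.
build_initial = 500_000
build_yearly = [60_000, 50_000, 40_000, 30_000, 20_000]

# Hypothetical "buy": lower initial cost, 18% annual maintenance
# with an assumed 5% increment built in each year.
buy_initial = 400_000
buy_yearly = [round(buy_initial * 0.18 * 1.05 ** year) for year in range(YEARS)]

build_totals = cumulative_costs(build_initial, build_yearly)
buy_totals = cumulative_costs(buy_initial, buy_yearly)

for year, (build, buy) in enumerate(zip(build_totals, buy_totals), start=1):
    print(f"Year {year}: build {build:>9,}  buy {buy:>9,}")
```

With these made-up inputs the packaged option is cheaper at first but the more expensive of the two by year five. Different assumptions will move the crossover point, or remove it altogether, which is why the comparison has to be done with the real figures for each case.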

However, the cost of the acquisition is only part of the story. Taken over the system life span, the cost of the software itself is typically only 10-20% of the total. Alan MacCormack (Assistant Professor of Business Administration at Harvard Business School) stated in a 2003 paper that:

Where costs do become significant for all types of software is in the level of staffing needed. By staffing, I mean the training, maintenance, support, administration and other personnel costs necessary to run the software package efficiently. These costs can add up to as much as 50% to 70% of a software system's Total Cost of Ownership over its useful life.

Normally these costs are not directly affected by the build or buy decision although if the vendor's consultants have to be used for training (a by no means uncommon condition when acquiring packaged software) there may be some direct impact.

Conclusion

There is no single 'right' answer to this question; each case has to be treated on its merits. However, the more specialized a company's systems are, and the more frequent the requirement for rapid response to change, the more important it is for the company to retain direct control over its software.

Apart from any other consideration, there is always the possibility that, by developing custom software for itself, the organization may find that it has ‘accidentally’ created a new vertical market application that has the potential to become a revenue generator in its own right. However, it should be stressed that this possibility should not, in itself, be grounds for choosing in-house development over a packaged solution – if only because all the reasons given above for not buying packages apply equally to other organizations to whom the company might eventually try to market their custom solution.

 

posted by andykr | 2 Comments

Programming to Interface – a real story

What constitutes a Public Interface, and where does it come from?

An object’s Public Interface is simply the set of Properties, Events and Methods (PEMs) that it exposes to its environment. Notice that the key word here is ‘exposes’. An object may (and usually does) have many more PEMs defined than actually appear in its public interface. This is because the purpose of the interface is not to define the full functionality of the object, but merely to define how other objects can interact with it. In other words, the interface defines the usage, not the implementation and, by that definition, only PEMs that are intended to be accessed, or manipulated, by other objects are exposed. Figure 1 shows the Interfaces for the Visual FoxPro Application object, and the methods (and some properties) exposed by the Public Interface ("Application").

Of course, the interface is defined by the class from which the object inherits directly, although that class need not, itself, define the entire interface. It may inherit either whole, or partial, interfaces from other classes. The actual mechanism and code by which interfaces are implemented is the responsibility of the developer and varies according to the programming language being used. However, the principles remain the same irrespective of implementation, which is what makes the concept of an interface so important when working with objects.

For example the whole basis of Microsoft’s Component Object Model (COM) rests upon the premise that:

A COM object is one in which access to an object’s data is achieved exclusively through one or more sets of related functions. These function sets are called interfaces, and the functions of an interface are called methods. Further, COM requires that the only way to gain access to the methods of an interface is through a pointer to that interface.

Thus all COM component interfaces (pre-defined or custom) ultimately inherit from a single ‘root’ interface called “IUnknown”. Components that support automation do so by implementing a specific interface, named “IDispatch”, which itself inherits from IUnknown (Figure 1). This brings us to the real reason that interfaces are so important.

Objects that share a common interface are interchangeable

This is what gives OOP systems such enormous power and flexibility. In an application where functionality is delivered by objects, modifying functionality only requires changing the object, not changing the code. In other words, instead of having to modify existing code (adding the inevitable bugs as we do so) we need only create a new object that provides the required behavior and use it in place of the original. However, this is only achievable as long as both objects share the same public interface, which is why rigid adherence to the concept of “programming to interface” is so important.

Bridges rely on interfaces

The “Bridge” pattern is probably the most important, and certainly the most fundamental, of all object oriented design patterns. The classic definition of a Bridge is that it decouples an abstraction from its implementation so that the two can vary independently. The implication of this is that no object should ever rely on the internal workings (i.e. the implementation) of another object. As stated above, an interface merely defines how two objects should communicate without specifying how either implements any specific functionality. The only knowledge that one object should have of another is the set of exposed methods, their parameters (if any) and return values (if any).

To see how this works in practice consider that when writing code in forms and classes we naturally want to trap for things going wrong and usually (being considerate developers) we want to tell the user when that happens. The most obvious, and simplest, solution is to include a couple of lines of code in the appropriate method that pops up a window and displays some text. The result will look something like this:

IF NOT <A Function that returns True/False>
  * Build the message text and display it in a WAIT window
  cErrorText = "Check failed to return the correct value" + CHR(13)
  cErrorText = cErrorText + "Press any key to re-enter the value"
  WAIT cErrorText WINDOW
  RETURN .F.
ENDIF

In our application we may well have dozens of situations where we display a window like this to keep the user posted as to what is going on and this works perfectly well until one of two things happens. Either our users decide that they really hate these pesky little “pop-up windows” and would much prefer a more conventional Windows-style message box, or, much worse, we need to deploy the code in an environment that simply doesn’t support “pop-up windows” (maybe as a COM component, or in a web form). We must now go and hunt through our application and find every occurrence of this code and change it to support the new requirement (whatever that may be!).

We don’t know about you, but in our experience the chances of getting such a task right first time (not missing any occurrences and re-coding every one perfectly) are so close to zero as to be indistinguishable from it. Even if we could be fairly confident of doing it, we still have the whole issue of testing it to deal with.

So what has this to do with the Bridge pattern? Well, the reason that we have this problem is because we failed to recognize that we were coupling an abstraction (displaying a message to the user) to its implementation (the “pop-up Window”). Had we done so we might have used a bridge instead and then we could have avoided the problem entirely. Here’s how the same code would look if we had implemented it using a bridge pattern:

IF NOT <A Function that returns True/False>
  cErrorText = "Check failed to return the correct value" + CHR(13)
  * Hand the text to whichever handler is currently attached
  oHandler = This.oMsgHandler
  oHandler.ShowMessage( cErrorText )
  RETURN .F.
ENDIF

See the difference? We no longer know, or care, how the message is going to be displayed (so we don’t even need the ‘Press any key’ line because we can assume that it will be added in the message handler if it is required). All that we need to know is where to get a reference to the object which is going to handle the display for us.  It is that source of the reference that is the “bridge”. In this example the object’s “oMsgHandler” property provides the bridge between the code that requires a message and the mechanism for dealing with a message. Now, all that is needed to change the way in which our message is handled is to change the object reference stored in that property. That is something that could even be done at run time depending on the environment in which the parent object has been instantiated. This approach successfully de-couples the abstraction from its implementation and our code is much more re-usable as a result.

Now we can see how important the concept of an interface is to the bridge pattern. The implicit assumption behind the bridge is that all possible handlers will implement the appropriate interface. It is the interface that defines the method name (ShowMessage()) and parameters (a string to be dealt with) that all candidate implementations must follow.
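The idea can be sketched outside VFP too. The following is a minimal illustration in Python, not the code from our application: the class names (WaitWindowHandler, LogHandler, EntryForm) and the validate() method are hypothetical, and the only thing carried over from the discussion above is the single-method interface, here spelled show_message().

```python
class WaitWindowHandler:
    """Stand-in for the pop-up window: prints the text and a key-press prompt."""

    def show_message(self, text):
        print(f"[window] {text} (press any key)")


class LogHandler:
    """Stand-in for a non-visual environment, e.g. a COM component."""

    def __init__(self):
        self.messages = []

    def show_message(self, text):
        # No user interface available, so just record the message.
        self.messages.append(text)


class EntryForm:
    """Hypothetical form whose msg_handler property is the bridge."""

    def __init__(self, msg_handler):
        # The form knows only that the handler has a show_message() method.
        self.msg_handler = msg_handler

    def validate(self, value):
        if not value:
            self.msg_handler.show_message("Check failed to return the correct value")
            return False
        return True


form = EntryForm(WaitWindowHandler())
form.validate("")          # displayed by the stand-in window

quiet = LogHandler()
form.msg_handler = quiet   # swap the implementation at run time
form.validate("")          # same calling code, different behavior
```

Nothing in validate() changed between the two calls. Only the object reference held by the form did.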

Interfaces and Inheritance

Another example of the importance of interfaces is illustrated by the problem posed by the production of ‘output’ from an application. Such output can take many forms. It may be a printed report, a document sent via e-mail, an XML file sent to a different application, HTML sent to a browser, and so on. Clearly, the type of output required can vary widely between applications, between different parts of the same application and even at the same point in a given application depending upon user actions. This can result in some truly nightmarish code to handle the logic and deliver the correct form of output.

However, if all output is controlled by “output objects” that adhere to the same interface, it actually does not matter which one is used at any time. Each output object might have a single method called ReadData that accepts the required input and one called WriteData to produce the appropriate output. What each method does is irrelevant to the outside world. When a different form of output is required, a different object is selected, but the way in which you call that object does not change.

The flexibility that results from adopting this approach to design is independent of inheritance. Unfortunately, it seems that inheritance is the most over-used element of object-oriented technology even though it is actually the least flexible. This is because the inheritance hierarchy is defined at design time and there is no way to change an object’s pedigree at run time. Furthermore, subclasses inherit all of the characteristics of their parent class and, although it is possible to augment and specialize behavior in the sub-classes with well-planned hook methods, or by overriding inherited behavior, inheritance remains, essentially, a design-time tool.

The technique of selecting among several objects, all of which conform to the same interface, can be thought of as “run time inheritance”. By selecting a different object at runtime, you change the behavior of your application, but this is possible only because you have programmed the application to a defined interface rather than a specific implementation. When you know what parameters the object’s exposed methods require, you can just package up whatever that object expects and, like a ball, throw it over the wall to that object. You do not need to know, and should not even care, what the object is going to do once it catches the ball. All you need to know is how to ask that object to throw the ball back when the object is finished with it.
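As a rough illustration of selecting an output object at run time, here is a small Python sketch. The two classes, the OUTPUTS lookup table and the produce() function are all hypothetical. The only thing taken from the text is that every output object offers the same pair of methods, written here as read_data() and write_data().

```python
from html import escape


class XmlOutput:
    """Hypothetical output object that renders rows as a tiny XML document."""

    def read_data(self, rows):
        self.rows = list(rows)

    def write_data(self):
        body = "".join(f"<row>{escape(row)}</row>" for row in self.rows)
        return f"<rows>{body}</rows>"


class HtmlOutput:
    """Hypothetical output object that renders the same rows as an HTML list."""

    def read_data(self, rows):
        self.rows = list(rows)

    def write_data(self):
        items = "".join(f"<li>{escape(row)}</li>" for row in self.rows)
        return f"<ul>{items}</ul>"


# The choice of object is data, made at run time; the calling code never changes.
OUTPUTS = {"xml": XmlOutput, "html": HtmlOutput}


def produce(kind, rows):
    output = OUTPUTS[kind]()
    output.read_data(rows)      # throw the ball over the wall
    return output.write_data()  # ask for it back


print(produce("xml", ["Term Life", "A & B"]))
print(produce("html", ["Term Life"]))
```

Adding a new form of output means adding one class and one entry in the lookup table. The produce() function, and everything that calls it, stays as it is.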

The benefits of consciously programming to interface

So why are we bringing all of this up? Well, like most developers, we thought that we understood this concept. However, we were working on a project for a client that brought the importance of this technique home to us in a very practical sense. What we are going to show you is how you can reduce development time and, in general, keep your programming headaches to a minimum, if you adhere to the principle of programming to interface rather than implementation.

Let us start by telling you about the first meeting we had with the client to begin gathering requirements for their new Life Insurance Quote system. It went something like this:

From there, the transfer of ideas and communication went downhill!

Although our first thought was to create a table called “All-in-one” with one memo field called “everything”, it was clear that this approach would not work. We needed a model. But what information did we actually have? During that initial meeting, we had managed to agree that we had the following requirements:

  • Support multiple user interfaces including browsers
  • Support multiple databases including VFP
  • Interact with Microsoft Office applications; in other words, Automation

The logical design

Now we had to start making some design decisions. The main accounting system used by the client was written in Visual FoxPro and that system would be a major consumer of the output of the application. Since we needed to support multiple front ends as well as multiple back ends, we knew immediately that this was going to be designed as an n-tier application and with Visual FoxPro as one of the main client applications, it made sense to use VFP to build the middle tier components. We recognized that these classes might have to be instantiated directly by a VFP front end but still be capable of providing services for other front ends when compiled into a DLL. Since other applications do not understand VFP cursors, we would also need some sort of formatter object to convert cursors to XML that could be sent to a browser. This formatter object would also need to be able to package data in such a way as to provide data for Microsoft Office applications (we knew we would need to talk to both Excel and Word) in the form of ADO Recordsets.

The fact that we might have to support multiple back ends meant we would also need some sort of “converter” for non-VFP back ends. Furthermore, since the client was unable to give us the details we needed to build a complete data model, we were going to have to be prepared to modify our data structure early and often. In order to minimize the number of changes required as the data model changed, we decided to data-drive everything.

This meant that we were unable to use views because table and field details are hard-coded into the view definition. Instead, we decided to use a set of data classes that were, themselves, data-driven. (A discussion of data classes as an alternative to views is beyond the scope of this article. However, if you are interested in more details, see Chapter 13 in ‘1002 Things You Wanted to Know About Extending Visual FoxPro’, published by Hentzenwerke). It is sufficient for the purposes of this discussion to say that our design decision meant that we would require a data manager object that could “talk” to the various databases using SQL Pass-Through (SPT) and ODBC.

While it might have been sexier to opt for OLEDB, we could not be certain that all of our data sources would have an OLEDB provider (especially since we might have to deal with older, legacy, systems). More importantly, since OLEDB only produces ActiveX Data objects, not VFP cursors, there was little benefit in using it anyway because we were going to be building the components in VFP. We now drew our first (logical) model for the system:

 

The physical design

This model was easy to draw, but harder to build. In order to implement it we started by modeling the various real world entities that we knew would be required in the application. Since this was an Insurance application these included things like ‘Quotes’, ‘Policies’, ‘Investment Funds’ and so on. The idea was that each entity would be responsible for managing its own data set and the generic entity root class included all the necessary functionality to retrieve and save data sets. Code in the concrete classes augmented and specialized this core functionality.

The data sets were defined in local VFP tables, which meant that as the project evolved and the data structures inevitably had to be modified, we didn’t have to change any of our code. All we had to do was modify the metadata, which consisted of three tables: one for the entities, one for the actual cursors used in the dataset and a link table to relate individual cursors to an entity. Of course, the same physical cursor could be used by several different entities, either alone, or in combination with others.

Two further supporting tables were required. The first contained the detailed cursor definitions and provided all the information required for the entity to build the SQL statement which would be used to retrieve data from the database. Note that while it defined the fields, tables and joins required to build the cursor it did not actually specify a data source. The information required to connect to the various back ends was stored in the second supporting table.
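To give a feel for what "data-driving" the SQL means, here is a minimal Python sketch. It is not our metadata layout: the dictionary structure, the table and field names, the DSN string and the build_select() function are all invented for illustration. What it does show is the separation described above, with the cursor definition (fields, tables, joins) held apart from the connection details.

```python
# Hypothetical cursor definition: fields, tables and joins, but no data source.
CURSOR_DEFS = {
    "quote_list": {
        "fields": ["q.quote_id", "q.quote_date", "c.client_name"],
        "from": "quotes q",
        "joins": ["JOIN clients c ON c.client_id = q.client_id"],
    }
}

# Hypothetical connection details, stored separately from the cursor definitions.
CONNECTIONS = {
    "products": "DSN=products_db",
}


def build_select(cursor_name, condition=""):
    """Assemble a SELECT statement from the metadata for one cursor."""
    definition = CURSOR_DEFS[cursor_name]
    parts = [
        "SELECT " + ", ".join(definition["fields"]),
        "FROM " + definition["from"],
        *definition["joins"],
    ]
    if condition:
        # A real system would pass values as parameters, not splice them in.
        parts.append("WHERE " + condition)
    return " ".join(parts)


sql = build_select("quote_list", "q.agent_id = 29")
request = (CONNECTIONS["products"], sql)  # where to run it, and what to run
print(request)
```

Changing a field list or a join is then a data change in CURSOR_DEFS, and pointing the same cursor at a different back end is a data change in CONNECTIONS. In neither case does build_select() need to be touched.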

So we finally had a working model for our application. The functionality would be contained in the entities which would have all the necessary information to allow them to connect to and communicate with a back end database. Now all we had to do was to define the public interface for the entities.

Defining the Interface

Rather than exposing the individual entities directly to the outside world we implemented a façade pattern using a new class. This class provided the entry point into the application and hid the complexity of dealing with the sub system of individual entities. Named ‘InsLink’ (for Insurance Link Object) it was defined so that it could be built as a DLL. Its public interface turned out to be very simple indeed. In fact there were only two exposed methods needed.

The first method was used to retrieve a data set from the back end and return it as an XML data stream – this was GetEntity(). The second accepted an XML data stream and saved the content to the appropriate tables in the database – this was SaveEntity(). It returned the completion status of the save request to the front end. The calling prototype for these methods was identical:

  • GETENTITY( Entity Name, Condition (string), Connection To Use, User Id )
  • SAVEENTITY( Entity Name, XML data (string), Connection To Use, User Id )
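The shape of such a façade can be sketched in a few lines of Python. This is an illustration only: the QuoteEntity class, its in-memory rows and the XML layout are all hypothetical, and the connection and user id arguments are accepted but ignored. The one thing borrowed from the real design is the pair of methods with matching four-argument prototypes.

```python
import xml.etree.ElementTree as ET


class QuoteEntity:
    """Hypothetical entity that manages its own (in-memory) data set."""

    def __init__(self):
        self.rows = [{"id": "1", "product": "Term Life"}]

    def fetch(self, condition):
        # Stand-in for a real query: keep rows whose product contains the text.
        return [row for row in self.rows if condition in row["product"]]

    def store(self, rows):
        self.rows.extend(rows)
        return True


class InsLinkFacade:
    """Single entry point; callers never touch the entities directly."""

    def __init__(self):
        self._entities = {"quote": QuoteEntity()}

    def get_entity(self, entity_name, condition, connection, user_id):
        rows = self._entities[entity_name].fetch(condition)
        root = ET.Element(entity_name)
        for row in rows:
            ET.SubElement(root, "row", row)  # each row becomes XML attributes
        return ET.tostring(root, encoding="unicode")

    def save_entity(self, entity_name, xml_data, connection, user_id):
        rows = [dict(element.attrib) for element in ET.fromstring(xml_data)]
        return self._entities[entity_name].store(rows)


link = InsLinkFacade()
saved = link.save_entity(
    "quote", '<quote><row id="2" product="Whole Life"/></quote>', "products", 29
)
xml_out = link.get_entity("quote", "Life", "products", 29)
print(saved, xml_out)
```

The caller deals in entity names and XML strings and nothing else, which is what leaves the entities behind the façade free to change.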

So far so good, but what about the Automation aspects? Excel was to be used to handle the (very complex) calculations involved in generating Quotations, Forecasts and Proposals. The problem was that several different spreadsheets were involved and each had different requirements and returned different things. Our application had to collect the data from the user (via the Front End), package up that data, pass it to Excel and finally retrieve and unpack the results. However, we did not want our application to have intimate detailed knowledge about the inner workings of all of these spreadsheets. Further investigation revealed that we also had to interact with Word to produce various documents. For reasons too complex to go into here, these had been set up as bookmarked documents rather than templates. The requirements for the Word automation included:

  • Replacing part of a bookmark
  • Replacing an entire bookmark
  • Selecting one or more of multiple bookmarks
  • Deleting unused bookmarks

So the big question was, how could we reconcile all these requirements? We finally realized that we were worrying about implementation again, not interface! Once we recognized that it was not really our problem, we realized that all we had to do was to define an interface that all automation objects, whether Word Documents or Excel Spreadsheets, could adhere to. The result was the addition of two VBA functions to all documents and spreadsheets:

  • ReadData() to accept an ADO recordset and populate itself
  • WriteData() to return the results of the server operation

The server-specific functionality was provided by these functions and this now meant that the only knowledge required by our application was the details of the interface! A new set of entities were designed to handle the automation requirements – although the basic functionality was very similar to that of the data handling entities. To accommodate these new entities we merely extended the public interface of the InsLink class to include a third exposed method (“CalcEntity”) that was used to initiate an automation process. The calling prototype for this method was, quite deliberately, made very similar to our GetEntity and SaveEntity methods, and it returned whatever the automation server passed back, thus:

uRetVal = oInsLink.CalcEntity( <Entity Name>, <XML Data (string)>, <Connection> )

We had, at last, managed to encapsulate all of our key processes. The second, physical, model was a bit different from the original logical model:

 

It took us about eight weeks to build our first working DLL based on this model. Our client wanted to see the DLL in action. So we prepared a little demo for him that would allow insurance agents to log in to the web site and access the appropriate products. The code looked something like this:

* Instantiate the DLL and ask for the products available to agent01
oInsLink = CREATEOBJECT( 'Inslink.xInsLink' )
cXML = oInsLink.GetEntity( 'vflogin', 'agent01, agent01', 'products', 29 )

It returned an XML string that contained a list of products that agent01 had access to and displayed them on the web page, like this:

 

Needless to say, the client was not overly impressed! They needed something that could be demonstrated to potential investors, users and other interested parties. But to build the full web interface was going to take a significant amount of time anyway, and it was needed “RIGHT NOW!”

Does this sound familiar? Even if the web forms could have been ready much quicker than anticipated (which was not our problem fortunately), the requirement for something that was immediately demonstrable was still a problem. By this time though, we had gleaned enough information so that we actually had some idea of what this application was meant to do! So we finally did what we would have done in the first place if the clients had been able to be more specific about their business requirements: we built a working prototype of the entire application that could be run on a stand-alone laptop machine.

The Prototype

Now, here is the moral of the story and the point of this article. We already had a functioning DLL. Because we had rigorously programmed to interface rather than implementation, all we did to create a working prototype was to create a form class that had the same public interface as our DLL class. We simply replaced the DLL with our new form class and built a working VFP prototype in less than a week.

Compare Figure 5 with the physical design (Figure 3) and you will see exactly what we did. No change in functionality was required in order to convert our DLL to a working VFP prototype. All we had to do was swap one class (Inslink) for another (the Inslink Form class). Everything continued to work because the two classes shared the same public interface.

 

Conclusion

The project which we have outlined in this paper really happened, just as described. It brought home to us, in a very specific and real fashion not only the importance of programming to interface, but also just how much time and effort it really can save you.

 

 

posted by andykr | 1 Comments

Our DevTeach Experience Follow-Up

I have one more point to add in respect of our Vancouver DevTeach experience in June of this year. If you read my blog last week you will know that I had a complaint about the lack of conference materials in general and specifically that:

There was also no conference CD (though I did get an Email telling me that session materials would be downloadable from 6/18)

Well, I just downloaded the session materials for the sessions I wanted and I am seriously angry - for several reasons: 

  • There is no composite download. You have to go to the web site and download each session's materials individually. There are not even files by track - only by session. What a waste of time!
  • Even the session file(s) are not consistent - some have 1 file, some have a dozen. Why couldn't the presenters have zipped their material into 1 file per session?
  • On some sessions, when you try to download the file you get a "Page Not Found" error, other sessions simply have a "No materials available" message
  • When you do get something it is, by and large, useless. The vast majority of "Session Materials" turn out to be just the PowerPoint Slides! The few that have anything more are simply unannotated demo files
  • I spent more than an hour painstakingly downloading over 50 files, one at a time, in the hope of finally getting some value - but all I have gotten are a bunch of PowerPoint slides and the odd bit of sample code

There are NO white papers! Not one! Without a white paper a session is pretty much a waste of time.

What I mean is that at the conference you attend 15+ sessions in three days - how can anyone possibly remember all the details? You NEED the white papers so that later, when you come back to review the material, or look up the details of some vaguely remembered speaker's comment, you have some chance of finding it!

In all my years as a conference speaker I have never given a session that did not include a background paper that could be used by attendees to review the session contents and give meaning and context in retrospect. In fact I have never heard of a conference where the submission of a white paper was not part of the requirements for speakers (and I have even known cases where failure to submit a paper meant that you were dropped from the list and didn't speak at all!).

How can a conference organization be so appallingly dismissive of their attendees?

I really feel we were ripped off from A-Z by this conference and I have already said that we would probably not attend another DevTeach - now I am sure we won't.

In fact, nothing on earth would persuade me to attend another DevTeach and unless you especially like to waste money, I strongly urge you to try another conference where, perhaps, your money and time are appreciated sufficiently by the organizers that they will make a token effort to ensure that you get reasonable value.

The one thing this experience has taught me is that DevTeach does NOT deliver value by any reasonable measure.

posted by andykr | 2 Comments

Creating Data Driven Pop-Up menus in VFP

One of the little tools that Marcia and I use all the time when working in VFP is a pop-up menu generator that allows us to select a development environment easily and quickly. Yes, we know all about the Task Pane but personally I have always found that a real pain (pardon the pun) to use. All we wanted was a quick and easy way to be able to switch between my various environments.

It seemed to us that the easiest way to do that is to have a pop-up menu that could be invoked from a hot key. Then we realized that there are all sorts of situations in which it would be useful to have a pop-up menu (like right-click options on forms and controls) and so we came up with a simple menu generator to build a pop-up menu on the fly from simple tables. In order to make this generic we set up a relational structure using three tables as shown in Table 1.

These tables are all contained in their own DBC named (imaginatively) "PopMenu". So how do we use the tables?
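As a rough guide to the structure, the column names can be read off the GetMenuDef() query shown later in this article. The sketch below is an assumption built from those names only: the field widths, the AUTOINC keys and the width of cBarDesc are illustrative guesses, and the real definitions are the ones in GenBars.prg in the download.

```foxpro
*** Sketch only: structures inferred from the columns used in GetMenuDef().
*** Field widths and AUTOINC primary keys are assumptions, not the originals.
CREATE DATABASE PopMenu

*** One row per menu (e.g. "Projects", "EditOptions")
CREATE TABLE popnames ( ;
  imenupk    I AUTOINC PRIMARY KEY, ;
  cmenuname  C(20) )

*** One row per reusable menu bar
CREATE TABLE popbars ( ;
  ibarpk      I AUTOINC PRIMARY KEY, ;
  cbartext    C(40), ;
  cbardesc    C(60), ;
  mbaraction  M, ;
  mbarskip    M )

*** Link table: which bars appear on which menu, and in what order
CREATE TABLE poplink ( ;
  ilnknamfk  I, ;
  ilnkbarfk  I, ;
  ilnkseq    I )
```

Keeping the bars in their own table means that a single divider or "Paste" bar can be linked to as many menus as you like.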

Well, first we create the records in the names table. Currently I have two menus defined, one named "Projects" and another named "EditOptions". The first is the one that I run when I want to select a project environment, and the second is a generic editing options popup. Of course there is no real limit to the number of menus you can define but for the purposes of this article let's stick to these two.

Now we need to define our menu bars for these options and this is really simple. The first thing we need is a horizontal menu divider (grouped options always look better) so that gets defined as bar #1, as follows:

cBarText     = "\-"
cBarDesc     = "Divider Line"

Nothing else is needed for this entry. Now we need a bar for each option that we want to include, but for these bars we also need to include the action (and Skip For commands if needed). For my default development environment I use the following:

cBarText     = "\-"
cBarDesc     = "Divider Line"
mBarAction  = " CLOSE ALL
                       CLEAR
                       SET PATH TO (HOME() + ";D:\VFP90\RUN\;LIBS;FORMS;DATA;PROGS;UTILS;BMPS" )
                       SET DEFAULT TO D:\VFP90\RUN\"

And for the "Paste" option on the EditOptions menu we need to include a SKIP FOR condition too:

cBarText     = "Paste"
cBarDesc     = "Paste From Clipboard"
mBarAction  = "SYS(1500, '_MED_PASTE', '_MEDIT')"
mSkipFor     = "EMPTY( _ClipText )"

 

The code to generate the tables is in the attached zip file as GenBars.prg. You can use this file as the template to set up your own version of this little utility, and Figure 1 shows how the resulting data looks:

 

Now all we need is the code to use this. We created this as a class that does all the work in its INIT() event so that it runs itself when called. By returning .F. on completion of the code we prevent the object from actually being instantiated so that there is no lasting impact on the system at run time. The class is based on the session base class and this means that it creates a transient data session of its own – again, to avoid any environmental impact.

However, this also means that the class cannot be defined visually and so, if you really wanted to have this object defined as a visual class, you would have to use either a Toolbar, Form or FormSet – these being the three visual classes that can create a Private Datasession. (Note: Of the three, the formset is actually the one with the smallest memory footprint – so this may be the only time you might actually use a formset in VFP. Personally I don't care if the definition is visual or not and so I just use the Session base class.)

The Init() method is very simple indeed, as follows:

PROCEDURE INIT( tcMenuName )
  LOCAL lcScript
  *** Have we got this menu definition
  IF This.GetMenuDef( tcMenuName )
    lcScript = This.BuildMenu( tcMenuName )
    EXECSCRIPT( lcScript )
  ENDIF
  RETURN .F.
ENDPROC

If the passed in menu name is found by GetMenuDef(), the BuildMenu() method is called. This creates a temporary MPR file using the data from the metadata tables and the file is then executed to display the menu. On completion of the menu action, the method returns false, preventing the object from actually instantiating.
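To make the "do the work in Init(), then refuse to instantiate" idea concrete, here is a stripped-down sketch. The class name xDemoPop and the WAIT WINDOW message are hypothetical stand-ins for the real xMenuPop class; only the pattern matters.

```foxpro
*** Sketch only: xDemoPop is a hypothetical stand-in for xMenuPop
LOCAL loDemo
loDemo = CREATEOBJECT( 'xDemoPop', 'Projects' )
*** Init() returned .F., so no object reference comes back
? VARTYPE( loDemo )

DEFINE CLASS xDemoPop AS Session
  PROCEDURE Init( tcMenuName )
    *** All the work happens here, inside the Session's own data session
    WAIT WINDOW "Would build and run menu: " + tcMenuName NOWAIT
    *** Returning .F. stops the object from being instantiated
    RETURN .F.
  ENDPROC
ENDDEFINE
```

Because nothing is left behind, the caller never has to release anything, which is what makes the one-line NEWOBJECT() calls shown later in this article possible.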

The GetMenuDef() method simply executes a SQL query, creating a cursor that contains the relevant data to generate the required menu:

PROCEDURE GetMenuDef( tcMenuName )
  LOCAL lcMenuName
  lcMenuName = UPPER( ALLTRIM( tcMenuName ))
  *** Populate the cursor with the bars linked to this menu, in display order
  SELECT PB.cbartext, PB.mbaraction, PB.mbarskip, PL.ilnkseq ;
    FROM popnames PN, popbars PB, poplink PL ;
   WHERE PB.ibarpk = PL.ilnkbarfk ;
     AND PL.ilnknamfk = PN.imenupk ;
     AND UPPER( PN.cmenuname ) = lcMenuName ;
     AND NOT DELETED( 'poplink' ) ;
   ORDER BY PL.ilnkseq ;
    INTO CURSOR curMenu
  *** Did we get anything?
  RETURN (_TALLY > 0)
ENDPROC

This cursor is then used by the subordinate methods (GetBars() and GetActions()) that are called by the BuildMenu() method. Here is the script generated for the EditOptions pop-up and, as you can see it is a perfectly standard MPR file that uses ExecScript in the ON SELECTION clause to execute whatever action has been defined:

DEFINE POPUP editoptions SHORTCUT RELATIVE FROM MROW(),MCOL()
DEFINE BAR 1 OF editoptions PROMPT [Copy]
DEFINE BAR 2 OF editoptions PROMPT [\-]
DEFINE BAR 3 OF editoptions PROMPT [Paste] SKIP FOR EMPTY( _ClipText )
DEFINE BAR 4 OF editoptions PROMPT [\-]
DEFINE BAR 5 OF editoptions PROMPT [Cut]
ON SELECTION BAR 1 OF editoptions EXECSCRIPT( [SYS(1500, '_MED_COPY', '_MEDIT')])
ON SELECTION BAR 3 OF editoptions EXECSCRIPT( [SYS(1500, '_MED_PASTE', '_MEDIT')])
ON SELECTION BAR 5 OF editoptions EXECSCRIPT( [SYS(1500, '_MED_CUT', '_MEDIT')])
ACTIVATE POPUP editoptions
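
A script like the one above could be assembled from the curMenu cursor along the following lines. This is a hedged sketch and not the code from the download: the function name BuildMenuSketch is hypothetical, it rolls the work of GetBars() and GetActions() into one loop, and it assumes each action fits on a single line and contains no square brackets.

```foxpro
*** Sketch only: builds a pop-up script from a hand-made curMenu cursor
CREATE CURSOR curMenu ( cbartext C(40), mbaraction M, mbarskip M, ilnkseq I )
INSERT INTO curMenu VALUES ( "Copy", "SYS(1500, '_MED_COPY', '_MEDIT')", "", 1 )
INSERT INTO curMenu VALUES ( "\-", "", "", 2 )
INSERT INTO curMenu VALUES ( "Paste", "SYS(1500, '_MED_PASTE', '_MEDIT')", "EMPTY( _ClipText )", 3 )
? BuildMenuSketch( "EditOptions" )

FUNCTION BuildMenuSketch( tcMenuName )
  LOCAL lcPopup, lcBars, lcActions, lnBar, lcNL
  lcNL      = CHR(13) + CHR(10)
  lcPopup   = LOWER( ALLTRIM( tcMenuName ))
  lcBars    = ""
  lcActions = ""
  lnBar     = 0
  SELECT curMenu
  SCAN
    lnBar = lnBar + 1
    *** One DEFINE BAR per row, with an optional SKIP FOR clause
    lcBars = lcBars + "DEFINE BAR " + TRANSFORM( lnBar ) + " OF " + lcPopup ;
           + " PROMPT [" + ALLTRIM( cbartext ) + "]"
    IF NOT EMPTY( mbarskip )
      lcBars = lcBars + " SKIP FOR " + ALLTRIM( mbarskip )
    ENDIF
    lcBars = lcBars + lcNL
    *** Dividers have no action, so only emit ON SELECTION when one exists
    IF NOT EMPTY( mbaraction )
      lcActions = lcActions + "ON SELECTION BAR " + TRANSFORM( lnBar ) ;
                + " OF " + lcPopup ;
                + " EXECSCRIPT( [" + ALLTRIM( mbaraction ) + "] )" + lcNL
    ENDIF
  ENDSCAN
  *** Header, bars, actions, then the command that shows the pop-up
  RETURN "DEFINE POPUP " + lcPopup + " SHORTCUT RELATIVE FROM MROW(),MCOL()" + lcNL ;
       + lcBars + lcActions + "ACTIVATE POPUP " + lcPopup + lcNL
ENDFUNC
```

Multi-line actions, such as the environment-switching commands shown earlier, would need different handling from this one-line approach.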

So how do we use this class? The PopMenu.prg is in my VFP root directory and the class name is "xMenuPop". My VFP startup program, called from config.fpw, includes the following two lines:

*** Set Projects menu Hotkey and Run the Projects menu
ON KEY LABEL CTRL+F12 NEWOBJECT( 'xMenuPop', 'D:\vfp90\popmenu.prg', NULL, 'Projects' )
NEWOBJECT( 'xMenuPop', 'D:\vfp90\popmenu.prg', NULL, 'Projects' )

The first assigns the projects menu to my CTRL+F12 key, and the second, which is the last line in my startup program, runs it immediately. Similarly, to use the EditOptions menu we simply add one line of code to the right-click of any control that we want to invoke the menu from:

NEWOBJECT( 'xMenuPop', 'D:\vfp90\popmenu.prg', NULL, 'EditOptions' )

The class definition, the DBC, tables and the sample data generation program are all included in the downloadable zip file attached to this blog. I hope you find it as useful as we do.
posted by andykr | 0 Comments
Attachment(s): PopMenu.zip

Our Vancouver Devteach Experience

You may have noticed the blog has been quiet for a couple of weekends. This is because Marcia and I have been out of town to attend the DevTeach conference in Vancouver. As regular readers of this blog will know, I have for many years extolled the virtues of conference attendance and encouraged people to ‘put their money where their mouths are’ and attend conferences.

On occasion it has been suggested that this was an easy position for me to take because I have been, for more than ten years, a speaker at conferences and haven’t actually had to put my hand in my own pocket to attend. The fallacy with that argument is, of course, that preparing a conference session takes many hours of effort – it’s not just a case of throwing a few slides together and dummying up a few code samples; there's a paper to be written to support it all, and many hours of rehearsal, review and rework. My own estimate is that a typical 75 minute session requires 80 hours of work, and a 4 hour “pre-con” can take several weeks! So while there may be no direct cash payment, the cost in terms of lost time (and income) can be considerable.

What is the point of this? Well, as you may know Marcia and I decided last year that we were not going to speak at conferences any more and that SW Fox in October 2008 would be the last time we would submit sessions. As it happens, Marcia was not selected for that conference anyway – but either way it was our last as speakers. So this year we decided to attend DevTeach in Vancouver. We might have chosen the Montreal venue last December (it’s a lot closer for us) but we were in England visiting family and so couldn’t make it. Besides, neither of us had ever been to British Columbia and so we decided to add a couple of days ahead of the conference and have a little vacation time in Vancouver too.

Why DevTeach? Well, while Marcia does have some existing clients still in VFP, all her new work is in .NET and I have been working almost exclusively in SQL Server for several years now anyway. Since admission to DevTeach (primarily .NET focused) also included admission to SQLTeach (SQL Server focused) it was an obvious choice. Not cheap, mind you, but obvious. It was also our first non-FoxPro conference for several years. The cash investment alone (fees, flights and hotel), for the two of us, ran close to $5,000.00 – not an inconsiderable sum in these tough economic times.

So, how did we fare? Well, I have to be honest and say that I did not really enjoy the experience. In the sessions that I attended, there were some good speakers who gave solid presentations and imparted a lot of good information. A definite plus there! However there were also some truly appalling speakers who, in my opinion, should not have been allowed in front of a paying audience. I am not saying that they were not knowledgeable, or that their topics were uninteresting, but their presentation technique was so bad that it was painful.

No names, no pack drill, but I actually walked out of one session (from the front row where I had been sitting). The speaker began 15 minutes late (he kept waiting [hopefully] for ‘stragglers’), then spent the next 10 minutes on his personal life history and eventually began talking without agenda, structure or plan. After another five minutes I realized that I had no clue what he was talking about and, much worse, felt that he did not know either. To cap it all, he actually asked me later how I thought he had done!

OK, so not everyone is a natural public speaker in the Jim Booth, Ted Roche or (putting false modesty aside) even Andy Kramek, mode. However, there are simple rules that, when followed, will allow anyone to give a competent and respectable presentation and what I am talking about is people who failed to follow these basic rules. This, to my mind, shows a cavalier disregard for their audience who have, after all, paid good money, and also invested their time to come and hear them.

So there were a couple of sessions that I thought were bad. Why am I whining, have I never seen a bad session before? Of course I have (and to be honest, I’ve probably given some in my time). That isn’t why I didn’t enjoy the experience. Taken overall, the technical content (from my SQL-oriented perspective) was at least ‘Good’ and verging on ‘Above Average’. But, as I have so often said, the technical content of the sessions is actually the least important part of any conference.

What is more important is the ability to mingle with like-minded people who share common problems and issues and exchange knowledge and views – i.e. Networking. This is the aspect in which I felt that DevTeach failed miserably. The question is, why? There could be several reasons, and I am not really competent to assess them (after all I wasn’t involved in planning the conference).

The first is that it was unquestionably a very “local” conference. Most of the attendees seemed to be either from the Vancouver area, or to have relations in the Vancouver area. Consequently at the end of sessions they went home. Result – no-one was left in the hotel and there was no real opportunity to network – people were either in sessions, or simply not there.

Second, the hotel layout made it a poor conference venue. The session rooms were split between two floors. Registration, the .Net and related Tracks, the Trade Show and breakfast/coffee between sessions were all on the third floor, while the Keynote, SQL tracks and lunch were on the second. Doesn’t make for a good mixing environment! The only bar was “L” shaped (and quite small) and even in the lobby there was really nowhere to sit and chat (a couple of sofas was about it). Taken all in all it is hard to imagine a less suitable venue for encouraging people to get together in groups.

Third, there was no real conference material. Upon registration we were offered a tote bag and a two page conference schedule and that was the sum total of everything. There was no binder describing the hotel, environs (where should we go for dinner outside the hotel?) and conference facilities (general timings, location of meals, where to look for notices etc) and, initially, not even paper and pens (though they did appear on Day 2 I believe). There was also no conference CD (though I did get an Email telling me that session materials would be downloadable from 6/18 – a week AFTER the conference ended but that the speakers' slide decks were available for download immediately). Not really the kind of supporting material I expected or am used to.

Fourth, communications in general were weak; in fact, on reflection, they were nonexistent! For example, someone actually complained, in conversation with Marcia on the afternoon of Day 2, that there was ‘not even lunch’ provided. They didn’t realize that lunch was being served on the floor below – but then how would they? There was no information provided at registration, no signs, and no announcements – in sessions or anywhere else - as to what facilities were provided, or where they were located. Worse, there was no indication outside the rooms telling you what sessions were being held. A simple chart for each room with Topic, Speaker and Time would have been very helpful. I ended up in the wrong room twice and had to do a hurried exit and re-entry when I realized that the topic being presented wasn't what I thought it was going to be. Little things, but…

Fifth, the Keynote. Ah how we love those keynote presentations. In this case the keynote was a demo of Visual Studio Dot_Next. Of course this was of great interest (NOT!) to us SQL Server people. If you are going to combine two conferences, why not have two keynotes? After all the speaker was just one of the regular conference speakers – not an imported celebrity (as at Russ Swall’s “Essential…” conferences), or even an Industry (i.e. Microsoft) expert as at the German Devcon or SW Fox. Couldn't a SQL Server speaker have given a SQL Keynote?

Sixth, out of session activities. This was the last nail in the coffin for me – there were none! Oh yes, the Vancouver IT community hosted a ‘free beer’ party on the Monday night. Unfortunately it was not in the Conference Hotel, nor even close, and unless you were a local (oh, hang on, most attendees were!) it was not an easy task to figure it all out. The only information was in an Email – no announcements, no organized travel (how about sign-up sheets for organizing groups to share Taxi costs….?), no information or directions on how to get there at the conference desk (which was unmanned most of the time anyway). But then this was nothing to do with the conference, and was, obviously, aimed at the local residents.

Apart from this there was nothing. No “mixers”, no “Show and Tell” evening sessions, no “mitt bier” evening sessions, just an empty hotel bar and lobby.  For the first time in more than 10 years attending conferences I was back in my room by 8:00pm every night.

Seventh, there was no passion. I attended 15 sessions over three days and I didn’t see one where I felt that the speaker was really passionate about their topic. For those of you who have ever seen Cathy Pountney, or Doug Hennig, or Jim Booth, Marcia or me (to name but a few) give a session, you will know what I mean when I talk about speakers with passion. The impression I had was that, for the DevTeach speakers, it was ‘just another day at the office’. Most of them made it clear that the sessions were not new and the feeling I got was that it was all a bit of a chore for them.

Finally, one of the things that I have always hated was there, in spades! This conference, more than any other I can remember, suffered from the “Inaccessible Speaker” syndrome. I do not recall, in three days, seeing any speaker outside of a session room – unless they were traveling in packs as when waiting to go out to their speaker dinner.

Trying to go and speak to a speaker is, for most attendees, pretty intimidating. When you have to interrupt a group of speakers who are interacting with each other it is positively frightening. At lunch, what attendee is going to go and sit down at a table of 6 speakers earnestly engaged in high level discussions of (presumably) great weight and import? The problem was compounded at DevTeach because most of the speakers didn’t wear their name tags outside of sessions (so unless you already know them you don’t even know that they are speakers) and so the only time the speakers were really visible was in sessions.

I know that Marcia and I (and the vast majority of the VFP Speakers, encouraged strongly, and in some cases even required, by VFP Conference Organizers) always made a conscious effort to be approachable, and accessible to attendees. We deliberately tried not to sit at “speaker tables” at lunch, and to ensure that we were always in the bar/lounge after sessions (an easy task for us :) ). Alas that culture does not seem to have carried over to DevTeach despite its original roots in the VFP community.

My conclusion? Well, as I said, technically the conference was adequate. Most presentations were competent, some were good and only a few were really poor. I did learn stuff; some of it will be directly useful to me, some is potentially useful and some just improved my general background knowledge and understanding. I have no complaints in this respect.

However, when you factor in the ancillary costs (travel time, lost work time, flight, accommodation and meal costs) the return on investment was very poor indeed. So while for a local resident it was probably cheap training, for us it was a very expensive, unstructured, training course with only moderate value. I left knowing no-one whom I did not know before I arrived (though it was certainly nice to see a few familiar faces among the speakers) and with the total number of business cards in my case the same as on the first day.

That was the most disappointing thing. The reality is that the whole conference felt like a commercial training course (where you sit in the classroom with a bunch of people you don’t know and with whom you have no real opportunity for communication and, at the end of the day, you all go your separate ways). From that perspective the training course is actually better value – at least there is a consistent and clear learning program.

Of course, this is just my opinion and others who were there may disagree with me but, as my readers will already know, I always try to tell it like I see it and this is how I saw it. I still believe that conferences can be great value, I just don’t think that DevTeach Vancouver was one of them.

So will we go to another DevTeach? Probably not. To another conference? Probably (right now I am not sure when or where, but I expect that we will) because despite this poor experience I still believe that GOOD conferences amply repay the investment you make in them. 

posted by andykr | 6 Comments

The Cost/Time/Content Triangle

The Cost/Time/Content Triangle is a simple way of representing the rather complex (not to say confusing) interaction between the three key components of any IT project - Cost, Time and Scope.  It provides a mechanism for visualising the effect of changes in any parameter and provides managers with a tool for quickly assessing the impact (and hence the risk) of changes to any one component on the others. 

The Cost/Time/Content Triangle is a graphical triangle whose three sides each represent a measure for an element of the project. By setting an appropriate scale for each side we can define the limits for the three key elements of any project. This is probably easier followed with an example, so let's consider a simple project that, based on the defined scope (the 'Content' element) has been estimated as comprising 4 key parts totaling 20 weeks of work as follows:

  • Main Data Screens             12 Weeks  (55% of the Content)
  • System Admin Functions         3 Weeks  (15% of the Content)
  • Standard Reports               4 Weeks  (25% of the Content)
  • User Definable Configuration   1 Week   ( 5% of the Content)

The percentages refer to the contribution each part makes to the whole project. Having got the Content, and Time elements we can calculate the Cost using our standard formula which, in this example is very simplistic and based on a standard Development Cost of $25 per man-hour + 25%, giving us a total cost of $25k:

20 * 40 hrs = 800hrs @ $25 = $20,000 + 25% = $25,000
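
For anyone who prefers to see the arithmetic spelled out, the same calculation can be sketched in a few lines of VFP. The variable names are purely illustrative and the figures are the ones used above; a real costing formula would, of course, have more moving parts.

```foxpro
*** Sketch only: variable names are illustrative, figures are from the text
LOCAL lnWeeks, lnHoursPerWeek, lnRate, lnMarkup, lnBaseCost, lnTotalCost
lnWeeks        = 20      && estimated elapsed weeks
lnHoursPerWeek = 40      && one developer, full time
lnRate         = 25      && standard development cost per man-hour ($)
lnMarkup       = 0.25    && the "+ 25%" uplift

*** 800 hours at $25 gives the base cost of 20,000
lnBaseCost  = lnWeeks * lnHoursPerWeek * lnRate
*** Adding 25% brings the total to 25,000
lnTotalCost = lnBaseCost * ( 1 + lnMarkup )
? "Estimated internal cost:", lnTotalCost
```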

(Note: this is the estimated internal cost of undertaking the project, not the price we intend to charge to the customer; that is a wholly different algorithm :) )

We can now construct our initial Cost/Time/Content Triangle, using these values to generate the scales. The objective is to get a balance so we want to get our initial estimate centered in the triangle. As you will recall from your Geometry classes, you find the center of a triangle by drawing a line from each vertex to the mid-point of the opposing side. Where the three lines intersect is the center. Where each of these lines meets the opposite side of the triangle we set the appropriate estimated value; i.e. Cost = $25k, Time = 20 Weeks and Scope = 100%, as shown in Figure 1.

Scope is, of course, the main driving force behind any project and, overall, will largely dictate both the project time scale and the cost. After all, the more you have to do, the longer it will take, and the more it will cost! However, not all elements of a project scope will have equal weight, and invariably the actual scope consists of two parts, the 'Must Be Done' list and the 'WIBNIF' (Wouldn't It Be Nice If…) list. The key to assessing scope is to determine what percentage of the project each element will account for.

Note that, thus far, our Content and Time scales are directly related (10% of the content = 10% of the time) which is what we would expect since our time estimate is based directly on the content. Cost is also directly related to content at this point because it too is derived solely from the content estimate and all we have considered to date is the "must-do" components. In fact, there were (as there always are) a couple of WIBNIFs that we estimated as follows:

  • Ad-Hoc Query Screen          2 Weeks
  • Ad-Hoc Report Generator      3 Weeks

Moreover our internal assessment is that both the "User Definable Configuration" and the "Standard Reports" are really WIBNIFs too (the basic project could be done as "Phase 1" with just Screens and Admin functions). So our project scope could drop as low as 70% of the base estimate, or go as high as 125%. These are, therefore, the limits for our "Content" scale, which we define as a linear scale. Similarly our time, based on this content, could be as short as 14 weeks, and go out to as much as 25 weeks and we should add an appropriate time scale that covers this range, with some additional margin at each end (we'll see why later). Draw lines from the Content and Time vertices to each point on the relevant scale - this gives us the CTC Triangle shown in Figure 2:

To complete the triangle we need to add the Cost scale. To get this we need to calculate the cost of the project given various alternate elapsed time scales – remember that we have based the estimate on the defined scope. Changing the scope will give us a different cost for a given time scale. Table 1 shows how this works:

Using this as the basis for our cost scale we get the final triangle (Figure 3):

All of our estimates relate to the original estimated scope and a time scale of 20 weeks, so now we can assess the impact of changes by simply reading off the appropriate scale. To ascertain the cost of doing the project in only 15 weeks, we do the following:

  • Draw a line from the Cost/Content vertex to the 15 Week point on the Time axis
  • Draw a line from the Time/Content vertex, through the intersection between our new line and the 100% Scope line and extend it to the cost axis to get the revised cost

The value is about $30k, which equates to $2k per week and is equivalent to 24 weeks at the standard rate - doing things quicker always costs more!

Similarly, we can estimate the reduction in cost if the project can be spread over a longer period. For example, if we took 24 weeks for the project it would cost only $22.5k. Although this equates to reducing our weekly rate to about $940, we still only have 20 weeks' worth of work to do and (hopefully) would be able to employ our developer gainfully elsewhere to make up the difference.
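
The weekly rates quoted for these two scenarios are easy to check. In this sketch the two costs are the values read off the triangle in the text, not something the code can derive, and the variable names are illustrative.

```foxpro
*** Sketch only: costs are the values read off the triangle in the text
LOCAL lnStdRate, lnFastRate, lnFastEquiv, lnSlowRate
*** Standard rate: 25,000 over 20 weeks = 1,250 per week
lnStdRate   = 25000 / 20

*** Compressed to 15 weeks at a cost of about 30,000
lnFastRate  = 30000 / 15          && 2,000 per week
lnFastEquiv = 30000 / lnStdRate   && the same money buys 24 standard weeks

*** Stretched to 24 weeks at a cost of 22,500
lnSlowRate  = 22500 / 24          && 937.50 per week, i.e. about 940

? lnStdRate, lnFastRate, lnFastEquiv, lnSlowRate
```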

Now what about those WIBNIFs? If there are no other changes in the original scope, but our client also wants the Report Generator (estimated at 3 weeks work – or an extra 15% on the original scope) we can immediately see our options:

  • To preserve the original cost, the project will have to take 6 weeks longer, reducing our cost to $960 per week, but gaining an additional 3 weeks of time in which to carry out the extra work
  • To deliver on the original time scale, the cost will have to go up to just over $31,000. Not only is this increasing the amount of work to fit in, but it is also increasing our risk of failure
  • Propose a compromise. Increase the project time line by 4 weeks, and the cost to $28,000. This gives a cost for the report generator of $3,000 but gains us an extra week over estimate in which to do the additional work
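
The figures behind the first and third options can be checked with a little arithmetic. Again this is only a sketch with illustrative variable names; the elapsed times and the $28,000 figure are the ones read off the triangle above.

```foxpro
*** Sketch only: checks the arithmetic behind options 1 and 3 in the text
LOCAL lnOrigCost, lnOrigWeeks, lnExtraWork
LOCAL lnOpt1Rate, lnOpt1Slack, lnOpt3Charge, lnOpt3Slack
lnOrigCost  = 25000     && original estimated cost
lnOrigWeeks = 20        && original elapsed time
lnExtraWork = 3         && weeks of work for the Report Generator

*** Option 1: same cost, delivered 6 weeks later
lnOpt1Rate  = lnOrigCost / ( lnOrigWeeks + 6 )    && a little over 960 per week
lnOpt1Slack = 6 - lnExtraWork                     && 3 weeks more than the work needs

*** Option 3: 4 weeks later at a cost of 28,000
lnOpt3Charge = 28000 - lnOrigCost                 && 3,000 for the report generator
lnOpt3Slack  = 4 - lnExtraWork                    && 1 spare week

? lnOpt1Rate, lnOpt1Slack, lnOpt3Charge, lnOpt3Slack
```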

In precisely the same way we can rapidly assess the effect of scope changes. If the decision is made that User configuration is not required after all, but that an Ad-Hoc Query screen is, the net result is to change the scope by +5%, and the estimate can now be revised as either:

  • Delivered in the same time at an extra cost of $1.5k
  • Delivered for the same cost, 2 weeks later

The triangle is really useful when you need, as in the example above, to cope with changes in scope (of the upward variety) at short notice and need to be sure that you can not only cover the costs, but can explain to the client why it is going to cost so much more. However, as you will no doubt have realized by now, the key to the triangle is the initial setting of the scales. This is only as good as the information which you use to generate your estimates, and the relationship which you determine between cost and time. 

Typically the cost relationship is not as simple as shown in this example – if we really wanted to increase the scope by 15% and deliver in the original time scale our poor developer would be working over 60 hours per week for five months (not a good scenario!). The reality is that we would need additional resources and that will change the cost relationship. Adding a developer is not simply a question of increasing the hourly billing rate; it also increases the fixed costs: each developer needs equipment, and Employment, Administration and Management Overhead costs all increase too. Similarly, if the workload falls below 40 hours a week it is not always possible to re-deploy the resource without losing income.

The reality is that the relationship between cost and time is typically a stepped relation, rather than linear. In fact, getting this relationship right is one of the hardest parts of any project estimate, but once you have it, the CTC becomes a powerful tool for assessing risk and managing change.

 

Finding the length of a string in a specific font

One of the things that I often found myself doing when working in VFP was trying to decide how large a textbox had to be in order to accommodate the maximum length of the data that was permitted for its underlying source given a specific font and size. As we all know, unless you stick to non-proportional fonts, the number of pixels required to display a string can vary dramatically depending on the font.

For example, the string "Andy Kramek" actually requires 88 pixels when displayed in Courier New, 10 point font. However, change the font to Arial and you need only 79, unless you make it Bold in which case you need 85. But if you change the font to Verdana then you need 85 for the plain text and 96 for Bold.

Given the multiplicity of fonts available you could be forgiven for thinking that determining exactly how much room to allow would be a fairly normal thing to want to do, and could therefore reasonably expect that VFP would have a native function that would tell you. Unfortunately, it doesn't. Now, at this point you may be thinking that this isn't really very important. After all, VFP sizes text boxes automatically using the underlying field definition when you drag a table field from the data environment, or project manager, on to a form.

Unfortunately the native sizing algorithm is not very accurate! For example, dragging a column defined as VARCHAR(50) to a form creates a textbox 357 pixels wide when the font is defined as Arial 9pt. However, the actual size required for a character string of 50 characters (using the average character width for this font) is only about 250 pixels! Even worse, the sizing has nothing to do with the font defined for the form and, even when you change the form's font in the underlying class, VFP simply carries on sizing controls for its default (Arial 9) font. Now that is a lot of wasted space on a form, and it can also be confusing to users because the size of the textbox is not a very reliable guide to the amount of text that it can hold.

We also have to consider the situation in which we need to concatenate data which is stored in individual fields together (to display a ‘full’ name for example, or a ‘City, State and Zip’ line for an address). VFP cannot help us here, and so the normal solution is to decide, empirically, how big to make the textbox. There are actually two issues involved here. The first is to determine the number of characters which we want the textbox to be able to handle and the second is to work out the amount of space that number of characters will need given a specific font setting.

Concatenating Fields

The default behavior of VFP when concatenating fields is to simply add together the defined lengths and create a new field whose width matches the total. Consider a set of name fields – let’s assume ‘FirstName’ is set up as 25 characters and ‘LastName’ as 40. Now VFP will always assume the worst case scenario when concatenating these fields and will generate a result field with a width of 65 characters (just run a SQL select that combines two character columns and you will see that this is indeed the case).
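A quick way to see this for yourself is sketched below. The cursor and its single row are made up for illustration, and the check relies on FSIZE() returning the field width, which assumes SET COMPATIBLE is OFF (the default).

```foxpro
* Illustration only: a made-up cursor with the field widths used above.
CREATE CURSOR curNames (FirstName C(25), LastName C(40))
INSERT INTO curNames VALUES ("Andy", "Kramek")

* Concatenate the two character columns in a SQL select.
SELECT FirstName + LastName AS FullName ;
    FROM curNames ;
    INTO CURSOR curFull

* With SET COMPATIBLE OFF, FSIZE() reports the width of the named field.
? FSIZE("FullName", "curFull")    && 25 + 40 = 65

USE IN curFull
USE IN curNames
```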

However, this does not represent the reality of the situation. Using the sample data that ships with VFP, we find that the average length of first names is actually 5.85 characters, and for last names it is only 6.83. So really we could use a textbox capable of showing, say, 25 characters and be confident that we would handle the vast majority of cases. While it is impossible to be prescriptive about this sort of concatenation it is easy enough to set a target size that reflects the typical results.
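If you want to run the same sort of check against your own data, a query along the following lines should do it. This is only a sketch: the cursor, the field names and the three rows are invented for the example, and multiplying by 1.00 is simply a way of asking for decimal places in the averages.

```foxpro
* Illustration only: invented sample rows, not the VFP sample data.
CREATE CURSOR curSample (FirstName C(25), LastName C(40))
INSERT INTO curSample VALUES ("Ann", "Lee")
INSERT INTO curSample VALUES ("Andy", "Kramek")
INSERT INTO curSample VALUES ("Christopher", "Featherstone")

* Average trimmed length of each column, plus the longest combined
* "First Last" value (the + 1 allows for the separating space).
SELECT AVG(1.00 * LEN(ALLTRIM(FirstName))) AS AvgFirst, ;
       AVG(1.00 * LEN(ALLTRIM(LastName))) AS AvgLast, ;
       MAX(LEN(ALLTRIM(FirstName)) + 1 + LEN(ALLTRIM(LastName))) AS MaxFull ;
    FROM curSample ;
    INTO CURSOR curStats

? curStats.AvgFirst, curStats.AvgLast, curStats.MaxFull

USE IN curStats
USE IN curSample
```

Comparing the averages with the longest combined value gives a feel for how much headroom a target of, say, 25 characters really leaves.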

So the number of characters is defined either by the underlying data source directly, or by some reasonable guess based on the required display.

But what about the font?

While we may be able to determine how many characters we want to display easily enough, as we have already seen, the size of the textbox that we need will depend upon the chosen font. In fact it is not only a question of the size of the font, but also of the style (e.g. Bold or Italic) and the font face. Even different fixed pitch fonts vary in their space requirement for individual characters.

Fortunately VFP does have functions, or more accurately a single, heavily overloaded, function to get this information. FONTMETRIC() can access twenty – yes, that’s right, 20 – different font attributes depending on the input parameter. I find it hard to remember the calling options for even one or two of these and even though IntelliSense makes it a little easier we still need to retrieve individual attributes one at a time. Since each returns the value for a single character, the whole process gets tedious, not to say messy, when you are trying to retrieve several attributes for the same string.
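To give a feel for what those individual calls look like, here is a minimal sketch that retrieves three of the attributes (1 = character height, 6 = average character width, 7 = maximum character width) for a few font faces. The list of fonts and the 10 point size are just examples, and the fonts are assumed to be installed.

```foxpro
* Illustration only: one FONTMETRIC() call per attribute, per font.
LOCAL laFonts[3], lnI, lnHeight, lnAvgWidth, lnMaxWidth
laFonts[1] = "Courier New"
laFonts[2] = "Arial"
laFonts[3] = "Verdana"

FOR lnI = 1 TO ALEN(laFonts)
    lnHeight   = FONTMETRIC(1, laFonts[lnI], 10)    && character height (pixels)
    lnAvgWidth = FONTMETRIC(6, laFonts[lnI], 10)    && average character width
    lnMaxWidth = FONTMETRIC(7, laFonts[lnI], 10)    && maximum character width
    ? laFonts[lnI], lnHeight, lnAvgWidth, lnMaxWidth
ENDFOR
```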

The SizeStr() Function

The solution is to create a simple wrapper function to determine the elements of the sizing that we are interested in. While we could grab all 20 attributes, in practice there are really only three things we are interested in: the maximum length that a string of this many characters could possibly require, the average length, and the exact length of the specified string. With this information we can make a sensible decision based on how typical our test string is, and how much variance we need to allow for in sizing our textbox. The SizeStr() function displays the results in a little modeless form (Figure 1) and returns the exact length of the specified string in the specified font.

 

Figure 1: The SizeStr() Result Screen (the function actually returns the Exact Width)

The exact length of a string is actually calculated as a two-step process. First we use the Visual FoxPro TxtWidth() function to determine the number of “Average Character Equivalents” in the test string when the specified font is taken into account. This calculated value allows us to treat proportional fonts as if they were actually fixed and so simplify the ensuing calculation. Thus, while a string containing 5 letters will always return 5 with a fixed pitch font, a proportional font will give different results depending on both the font itself, and any additional styles that have been defined. Table 1 shows the results from TxtWidth() for the string “This is a Test String”.

Having determined how many average character equivalents we have in our text string, we can simply multiply this number by the Average Character size (FontMetric(6)) to get the exact size, which is the value returned by the function. To get the Maximum size for a string containing the same number of characters, we multiply the number of characters in the test string by the Maximum Character Size (FontMetric(7)). Finally, we multiply the Average Character size by the actual number of characters in the test string to determine the average length of the string.
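As a rough sketch of that arithmetic (this is not the SizeStr() source, just the three multiplications described above, with variable names made up for the illustration):

```foxpro
* Sketch only: the three widths described in the text, in pixels.
LOCAL lcText, lcFont, lnSize, lcStyle
LOCAL lnAvgChars, lnAvgWidth, lnMaxWidth

lcText  = "This is a Test String"
lcFont  = "Arial"
lnSize  = 9
lcStyle = "N"    && N = normal; B = bold, I = italic and so on

* Step 1: length of the string in "average character equivalents".
lnAvgChars = TXTWIDTH(lcText, lcFont, lnSize, lcStyle)

* Step 2: average and maximum character widths for this font.
lnAvgWidth = FONTMETRIC(6, lcFont, lnSize, lcStyle)
lnMaxWidth = FONTMETRIC(7, lcFont, lnSize, lcStyle)

? "Exact width  :", lnAvgChars * lnAvgWidth
? "Maximum width:", LEN(lcText) * lnMaxWidth
? "Average width:", LEN(lcText) * lnAvgWidth
```

The figures will vary with the font face, size and style, and that variation is exactly what a wrapper function is meant to hide.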

But what about the height?

Unfortunately (again) VFP does not help us to decide how high a textbox should be when we change its font. FontMetric(1) will give us the actual height of the character in a given font, but we also need to allow for border height when sizing a textbox. In SizeStr() I do this by adding a factor of 28.125% of the character height in the specified font. This value has no theoretical basis but, by inspection, appears to be about right! The resulting value is shown on the display form.
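In code, that allowance amounts to something like the following. Again this is only a sketch under stated assumptions: the font is an example, and rounding up to a whole pixel is an assumption rather than something taken from SizeStr().

```foxpro
* Sketch only: character height plus the 28.125% allowance.
LOCAL lnCharHeight, lnBoxHeight
lnCharHeight = FONTMETRIC(1, "Arial", 9, "N")     && character height in pixels
lnBoxHeight  = CEILING(lnCharHeight * 1.28125)    && rounding up is an assumption
? lnCharHeight, lnBoxHeight
```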

The function is attached to this article and can be downloaded below. It accepts the following parameters:

  • tuInStr   [Required]  The input string to be tested
  • tcFName   [Optional (defaults to "Arial")]  The name of the font
  • tnFSize   [Optional (defaults to 9 point)]  The font size (in points)
  • tcFStyle  [Optional]  Font style codes (Bold, Italic, Underline etc)

The function pops up the modeless window shown in Figure 1 and, when the window is closed, returns the exact length of the input string in pixels. However, the values displayed in the form are all saved to an object in the function, so it is simple to modify the function to suppress the form and return the object directly.

Example

lnLen = SizeStr( "This is a test string", "Garamond", 12, "BI" )
? lnLen       && Returns 131

Hopefully you will find this little function as useful as I do.

posted by andykr | 3 Comments
Attachment(s): sizestr.zip