Welcome to Foxite.COM Community Weblog Sign in | Join | Help

Key Design Considerations for a Client Server Application

Whilst it is impossible to be prescriptive about how to build and implement a Client/Server application in the absence of detailed requirements and specifications, there are issues that are common to all applications and which need to be handled differently in Client/Server applications than in straightforward desktop applications. In this entry we will look at some key issues. 

User Access Control
In a client server application there are many ways to handle user access, and multiple levels at which it can be implemented:

  • User Interface               At this level we are concerned with two things. First the "Log In" which is the mechanism by which a user establishes their credentials and their right to access the application. Second we may need to implement differential access based upon a user's log in credentials. The actual implementation of both of these must be handled in the presentation tier, although the validation of the user input will have to be handled by the Middle Tier and will normally involve the database too
  • Middle Tier                    At this level the issue is to ensure that any request for data, or to save information to the database, is coming from a valid and authorized source. While less critical (and often omitted entirely) in LAN/Intranet applications, in any application that is exposed to the internet (whether directly or via a web service) this "Request Authentication" is crucial to maintaining application and data security and integrity. This will usually involve the database too
  • Database                       Back end databases (like SQL Server) implement their own multi-layered security model based on user login and define database "roles" that define the actions a user may perform. It typically allows the creation of, and assignment of users to, customized application roles that can be used to set up detailed rules for users

However, when reduced to basics, all of these come down to the same issue; do we recognize this user and  are they allowed to do whatever it is that they are asking?

As noted, back end databases implement complex, layered, security models but while such layered security is useful in some specialized circumstances it is unlikely that, in a Client/Server application it will be used to handle individual users directly. The reason is simply that the management and maintenance of a large population of users on an individual basis is time consuming (and expensive). The question that should, about now, be forming in your mind is how to validate that a user is allowed access to the database if that user does not have a login into the database.

Impersonation

This relies on the creation of two separate databases, the first contains only the tables needed to map user identities to connection information and the second is the application database itself (see Figure 1). The idea here is that the login screen in the presentation tier is directly connected to this database (using a trusted connection so that there is no need to set up individual users in SQL Server). If the user identity is validated, the necessary connection parameters for that user are returned.

Figure 1: User Impersonation model

This information can be as simple as the name of a DSN to use, an explicit connection string or the information for a standard 'user' assigned to a specific role within the database. That information must then be stored directly in the presentation tier and sent as part of each request to the Middle Tier. This, in turn, uses the specified information to connect to the application database and execute the request. This model has a couple of significant benefits:

  • The user has no direct login, and never has access to the information used to connect them to the application database
  • By mapping end users to 'standard' database users full use can be made of the features offered by SQL Server's security model since the middle tier "impersonates" the user on each request.

The main drawback is, of course, that the middle tier has either to keep connecting and disconnecting ( a relatively slow process), or maintain multiple connections to the database. A secondary concern is that the connection information has to be stored in the presentation tier and that, by itself, can be a security risk in some scenarios. However this model is certainly viable and is often used for in-house applications.

Authentication

In this approach the only user that has access to the database is the middle tier component (or, more accurately, the Data Access Layer of the Middle Tier) itself. When a user logs in, the middle tier uses its own connection to the database to validate the user information and generates an 'authentication ID' or 'token' that is returned to the Presentation Tier (Figure 2). Any further requests from this user include the assigned id and, providing it is recognized as valid (they are often time limited), no further authentication is needed. This is very widely used model and offers several benefits:

  • The middle tier is not constantly connecting/disconnecting, nor does it have to manage a connection pool. This also keeps the number of connections to the database to a minimum
  •  The middle tier can be responsible for assigning, and tracking, the authentication IDs. This hides all details of the implementation from the Presentation Tier which has only a single item of data to manage – the assigned authentication ID, usually stored in a "cookie"
  • Access security and control can be handled in code in the middle tier, which can even use locally held data to avoid making round trips to the server

Figure 2: User Authentication Model

The main drawback to this model is that is more complex than the simple impersonation approach outlined above and it places more responsibility on the middle tier. This model is widely used for applications that are exposed over the internet, either directly, or as a web service.

Error Handling

As noted earlier one of the major differences between a Client/Server application and a traditional Desktop application is in the way in which errors must be handled. We can safely assume that the initial detection and generation of an error will be handled by the application responsible for it, whether that is the database,  the Middle tier or the User Interface. So the issue really comes down to how to handle, and record, errors in this multi-tier environment.

The first issue to be decided is whether errors should be logged, and if so, where? The obvious place to log errors would be in the database but, of course, that becomes problematic if the error is such that the database is not available (i.e. a connection issue, or a network problem or even a server crash). One of the benefits of using a data aware middle tier (like Visual FoxPro) is that it does have a local database engine and can easily create local error logs  as local files. But what if it the problem is that the middle tier cannot be reached, or has crashed? 

In that scenario we are left with a User Interface which cannot connect to anything, or do anything other than simply tell the user that it cannot connect and, possibly broadcast some sort of notification – maybe by E-Mail (assuming that the client machine still has access to E-Mail of course). In other words if the primary connection to the middle tier cannot be established, that error has to be handled directly by the client.

In all other cases we really do want the middle tier to be the responsible component. Whether the error is within the middle tier itself, the database or the presentation tier all errors should be handled here. The mechanism for logging the error can be anything that is convenient although my personal preference is to use a local DBF file simply because it is easy to extract the information into any form that may be needed.

Whenever an error is detected, a call is first made to the middle tier's "Log Error" method which handles the task of logging the error. Subsequent action depends upon the nature and location of the error but typically it involves the transmission of a more user-friendly message to the Presentation Tier for display.

Transaction Management

By default back end servers generally implement "Implicit Transactions with Auto-Commit" which means that each SQL Statement executed by a connection is considered a separate transaction. In other words, each INSERT, UPDATE or DELETE command is treated as a separate transaction and either succeeds (and is committed) or fails (and is rolled back).

NOTE: Other commands may require, and initiate, transactions, but for  the purpose of this discussion we will concern ourselves only with the commands that actually modify data. Data Definition Language statements are rarely, if ever, used in a Client Server application environment.

Whether it is ever necessary to change this behavior will, of course, depend on the requirements of the application. As long as there is never any need to wrap multiple update commands in a single transaction the default setting will work just fine. However, when inserting records into a more complex relational model - where Parent-Child relationships must be maintained may require a departure from the basic one command = one transaction approach.

To handle more complex situations, the auto-commit on the connection is disabled so that transactions are only initiated by the server if a one is not already open, or when an explict "BEGIN TRANSACTION" command is issued. Open transactions are not closed by the server, and an explicit COMMIT or ROLLBACK command must be initiated by the application in order to close the transaction(s) that it has opened. Clearly this requires more complex programming, and moves the responsibility for transaction management from the database to the application. Hopwever, in situations where handling multi-table updates as a single unit of work there is no alternative.

Different servers implement transactions in slightly different ways and if you find that you need to use anything other than "Implicit with Auto-Commit" you will need to determine what options are available in the server that you are using and how to best utilize the functionality provided.

 

posted by andykr | 0 Comments
Filed Under:

Introduction to Client Server Applications

Following my post about Client Server architecture, it seems logical to take a look at how we go about building an application using that architecture. So here goes…

The Challenge

Perhaps the biggest conceptual change that most VFP Developers have to make when contemplating the Client/Server architecture is the requirement for maintaining the complete separation of the various tiers. This is something that, in the Visual FoxPro environment, we do not really pay much attention to – mainly because of the tight integration that VFP provides between the database, the programming language and the GUI. The result is that many, if not most, VFP applications have historically been built as "single tier" (or "monolithic") applications in which everything is compressed into the GUI. In other words, the FORM is responsible for everything including opening and closing data files, navigation, inserting and updating data.

This type of design (or lack of it!) is simply not possible when we have to use a remote database as the primary data store. The reality is that this single decision means that we have to completely re-think the way in which our applications are designed and built.

Defining Responsibilities

As with all design tasks, the first thing to do is to actually define the problem. In this case it is really quite simple – we need to know what responsibilities our application has and then identify where, in a three-tier architecture, they need to be located. The following list is by no means exhaustive but indicates the key responsibilities which can be applied to any application.

  • Connect to the Database             Without a connection to the database the application can do nothing at all. This is closely related to, and bound up with the next responsibility
  • Manage Security                        Security management can exist at many levels. But we need to differentiate between "Access" control (can the user get into the application, or the database) and "Functional" control (can the user see this menu item, or certain data fields or reports)
  • Interact with Users                     Unless this is a pure data processing application we will need some form of interaction with our users. The amount of functionality that is built into the user interface will often be determined by the type of interface that is required. For example, a stateless browser based application generally has a less User Interface capabilities than a form based one (irrespective of the language used to create the forms)
  • Access and Manipulate  Data       Quite apart from the requirements of security we need to consider how data is going to be accessed and manipulated. We cannot simply assume that the 'form' will do it and we certainly do not want to duplicate code for handling basic processes like saving changes, or adding new records.
  • Handle Errors                            By introducing separated tiers we now have the problem that errors can occur in places where we cannot simply pop up a message box! Some form of structured error handling is needed to ensure that errors are 'bubbled' up to the user interface, and possibly also transmitted back to the database for logging/storage
  • Validate Data                             The simplest form of validation is that data saves without error. However, when dealing with a remote data store it is vitally important that data is properly validated before it goes to the back. This needs to be handled either in the UI ("assistive validation") or in the middle tier ("rules validation").

Allocating Responsibilities

The conventional design for a Client/Server application, as we have already seen, involves three tiers. These are typically referred to as the UI (or 'Presentation') Tier,  the Middle ( or 'Rules') Tier and the Data (or 'Database') Tier. Given these tiers it is obvious where certain responsibilities must lie. Managing Application security and handling User Interaction and are obviously UI responsibilities, while the responsibility for connecting to, and manipulating, data obviously belongs in the Middle Tier. The task of actually storing and retrieving  data, together with the management of database security and referential integrity obviously belong to the Data Tier.

However that leaves us with Error Handling and Data Validation, neither of which can be definitively assigned to a single tier because both responsibilities impact multiple tiers. An error can occur in any tier but all user interaction must be handled by the UI Tier – so error information, irrespective of the source, must be displayed by the UI. Conversely, it is usual and customary to log errors when they occur in some form of persistent storage. That will require the involvement of both middle and data tiers and so the error handling system really does have to be spread across all three tiers.

Validation is another example of a responsibility that needs to be handled at different levels within the application. The UI (even the simplest) should handle the primary 'assistive' validation tasks. These include ensuring that all required data is present before allowing a save to be initiated, and enforcing  specific formatting to data elements (like Account Numbers). In other words, doing whatever is necessary to ensure that only valid data elements are sent from the UI to the middle tier.

The application of Business Rules must be handled in the middle tier. The reason is twofold. First, business rules should never be implemented in the database because a database may have to serve multiple applications and a single element of data could require different rules in different applications. Second, because the set of Business Rules applicable to any application is independent of the User Interface used to implement them. The only logical place, therefore, is to implement such rules in the middle tier – hence its alternate name as the "rules" tier.

The database is, of course, going to be responsible for enforcing data integrity rules. Typically these will be done by defining constraints and relationships and applying unique indexes on columns.

It should be clear, by now, that the design of a Client/Server application is radically different from that typically used for file-server applications. It is also considerably more complex and requires more code, which begs the question, is it really worth it?

Why Bother with Client/Server

The main advantages offered by the Client/Server architecture can be summarized into a number of key topics. However, as with anything, there are also disadvantages.

Advantages

Data Security

This is perhaps the main reason why existing applications are converted from file server based to Client/Server. The problem is that file server systems in general, and visual foxpro based applications in particular, are vulnerable at a number of levels. Quite apart from the possibility of unauthorized access (i.e. hacking) DBFs suffer from a number of limitations that, in a controlled and protected environment are manageable but which in a more open environment become serious drawbacks. These include

  • Associated files (CDX, FPT) prone to corruption
  • Data files easily accessed read by external programs/third party tools
  • No persistent transaction logging/recovery
  • No native backup/restore functionality
  • Hard to maintain because many 'admin' operations require exclusive use of tables

Performance/Scalability

While VFP is undoubtedly very quick and has excellent data access capabilities, it is workstation based and that means that all data must be transferred to the workstation. This can become a serious issue not just as data volumes increase, but also as the number of users increases, or on high traffic networks and where connectivity bandwidth is limited. By transmitting only requests and result sets over the network, instead of the raw data a Client/Server system can significantly reduce the amount of traffic.

Moreover, by running a dedicated application to handle the data queries, computing power can be optimized at the server rather than at the client level. This not only reduces financial cost (by reducing the hardware requirements for client machines) but also improves the scalability of the application. Since the server now the primary working machine (rather than the client) capacity can be increased by upgrading the server hardware without the necessity to alter the application code in any way.

Portability

Along with improvements in scalability the Client/Server architecture dramatically improves the portability of an application. The separation into three tiers can effectively remove implementation dependencies from the application. A well built Client/Server system is both database independent (in other words it is possible to change the back end without altering the application code) and user interface independent (the same middle tier can be accessed from a browser, or a VFP Form). This is a major consideration when designing a vertical market application where it is may not be possible to prescribe either hardware or software.

Resource Utilization and System Maintenance

The breaking of an application into separate tiers offers significant benefits in terms of system resource utilization and maintenance. In day-to-day terms, perhaps the most significant benefit is that individual users need not have their own accounts in the database. While this is rarely an issue when there are only a few users in the system, applications that have to be accessed by hundreds, or even thousands, of users pose a real problem from the system administration perspective.

Not only do all these users need to have their own accounts (with user IDs and passwords) but they have to be maintained as the user population changes with time. Plus, of course, if each user were to connect directly to the database the server workload increases because of the need to keep multiple connections alive and to keep polling for requests even when an individual connection is idle.

By routing all user interaction through a middle tier not only is the actual number of connections minimized, but the necessity for maintaining individual user accounts on the database is removed because the only 'user' who actually needs to connect is the middle tier

Disadvantages

The two main disadvantages of Client/Server systems are complexity and cost.

Complexity

Complexity exists at two levels.

  • First there is the inherent complexity in dealing with an application that spans multiple tiers. As indicated above in the discussion of responsibilities, issues like error handling are much more complex because of the requirement to maintain the integrity of each tier's function. The separation into discrete tiers also means that functionality that in conventional VFP applications would be handled as a single task may actually have to be split into multiple operations. For example it is not unusual to see, in a VFP application, a form method that processes all records in a table and reports progress to the user as each record is processed. This would be much more complex to code in a Client/Server environment because of the increased messaging required to communicate between the tiers.
  • The second level of complexity also arises out of the separation of the application into tiers. Building and testing a Client/Server application is more complex than a simple file-server application because of the need to establish and maintain the connections between the tiers (even if they are on the same physical machine, the tiers are logically separated and will always involve at least two software packages, and often more).

Cost

Client/Server systems are likely to be more expensive to build and implement for several reasons.

  • First, there are direct costs for application software, database software and the associated user licenses. Although the increasing capabilities of the "open source" databases like MySQL have done much to mitigate this factor in recent years.
  • Second, there are the hardware costs, typically a Client/Server application will require at least one, and often two servers (the data server and the application server) together with all the associated items (backup devices, disk arrays/mass storage devices, uninterruptible power supplies and network infrastructure).  In mission critical, or high availability, environments backup hardware with mirroring or fail-over capability may also be needed.
  • Third, not only is there more hardware required to support a Client/Server application, because the majority of the processing is handled by the server, but that hardware also has to be much more capable (i.e. more expensive) than a simple file server. You can get away (almost!) with using a realtively low end PC with lots of disk space as a file server, but you certainly can't use that type of machine for a database server.
  • Fourth, the initial development costs will also be greater than for a simple file server based application. This is partly due to the need to build the functionality into separate components, but also due to the greater complexity of those components. The effort required at all stages of the development life cycle, Design, Coding and Testing is significantly greater for a Client/Server application than for an equivalent file server application.
  • Finally, there are personnel costs. Maintaining a DBF based application does not require any great level of specialized knowledge but running a major SQL Server, or other back end, database is a highly specialized task. Database Administrators are skilled specialists and, like all speicialists, are (relatively) expensive

Is it really worth it

The only possible answer is, of course, that it depends on the application. However it is generally true that in any environment that will involve more than a few users, the advantages of the Client/Server architecture far outweigh the  disadvantages – which is precisely why it has been so widely adopted and why there have been no significant changes to the architecture for many years.

 

posted by andykr | 0 Comments
Filed Under:

The VFP MVP Awards for 2008/2009 have been announced

I am very proud that Microsoft have given me an MVP Award in Visual FoxPro for the year 2008/2009. This is the eleventh time in succession that I have been so honored and I am always conscious of how very lucky and privileged I am to be recognized in this way.

Marcia has also been renewed as an MVP - this is her tenth successive award - and she too is very proud of her award.

There are always many more people contending for the award than can possibly receive it especially as the number of FoxPro MVPs is dwindling - this year there are only 28 of us - which makes the award even more flattering. We will, as always, try to live up to the standards demanded of us as MVPs.

 

posted by andykr | 6 Comments
Filed Under:

Introduction to Client Server Architecture

I am often surprised to find that many developers today still do not really understand what is meant by a "Client Server" architecture or what the difference between "tiers" and "layers" is. So I thought I would post the following explanation which, if not universally accepted, has served me well over the past 10 years.

Evolution of Client/Server Systems

Computer system architecture has evolved along with the capabilities of the hardware used to run applications. The simplest (and earliest) of all was the "Mainframe Architecture" in which all operations and functionality are contained within the central (or "host") computer. Users interacted with the host through 'dumb' terminals which transmitted instructions, by capturing keystrokes, to the host and displayed the results of those instructions for the user.  Such applications were typically character based and, despite the relatively large computing power of the mainframe hosts were often relatively slow and cumbersome to use because of the need to transmit every keystroke back to the host.

The introduction and widespread acceptance of the PC, with its own native computing power and graphical user interface made it possible for applications to become more sophisticated and the expansion of networked systems led to the second major type of system architecture, "File Sharing". In this architecture the PC (or "workstation") downloads files from a dedicated "file server" and then runs the application (including data) locally. This works well when the shared usage is low, update contention is low, and the volume of data to be transferred is low. However it rapidly became clear that file sharing choked as networks grew larger, and the applications running on them grew more complex and required ever larger amounts of data to be transmitted back and forth.

The problems associated with handling large, data-centric applications, over file sharing networks led directly to the development of the Client/Server architecture in the early 1980s. In this approach the file server is replaced by a database server (the "Server") which, instead of merely transmitting and saving files to its connected workstations (the  "Clients") receives and actually executes requests for data, returning only the result sets to the client. By providing a query response rather than a total file transfer this architecture significantly decreases network traffic. This allowed for the development of applications in which multiple users could update data through GUI front ends connected to a single shared database.

Typically either Structured Query Language (SQL) or Remote Procedure Calls (RPCs) are used to communicate between the client and server. There are several variants of the basic Client/Server architecture as described below.

The Two Tier Architecture

In a two tier architecture the workload is divided between the server (which hosts the database) and the client (which hosts the User Interface). In reality these are normally located on separate physical machines but there is no absolute requirement for this to be the case. Providing that the tiers are logically separated they can be hosted (e.g. for development and testing) on the same computer (Figure 1).

Figure 1: Basic Two-Tier Architecture

The distribution of application logic and processing in this model was, and is, problematic. If the client is 'smart' and hosts the main application processing then there are issues associated with distributing, installing and maintaining the application because each client needs its own local copy of the software.  If the client is 'dumb' the application logic and processing must be implemented in the database and then becomes totally dependent on the specific DBMS being used. In either scenario, each client must also have a log-in to the database and the necessary rights to carry out whatever functions are required by the application. However, the two tier client/server architecture proved to be a good solution when the user population work is relatively small (up to about 100 concurrent users) but it rapidly proved to have a number of limitations.

·         Performance: As the user population grows, performance begins to deteriorate. This is the direct result of each user having their own connection to the server which means that the server has to keep all these connections live (using "keep-alive" messages) even when no work is being done

·         Security:         Each user must have their own individual access to the database, and be granted whatever rights may be required in order to run the application. Apart from the security issues that this raises, maintaining users rapidly becomes a major task in its own right. This is especially problematic when new features/functionality have to be added to the application and users rights need to be updated

·         Capability:      No matter what type of client is used, much of the data processing has to be located in the database which means that it is totally dependent upon the capabilities, and implementation, provided by the database manufacturer. This can seriously limit application functionality because different databases support different functionality, use different programming languages and even implement such basic tools as triggers differently

·         Portability:      Since the two-tier architecture is so dependent upon the specific database implementation, porting an existing application to a different DBMS becomes a major issue. This is especially apparent in the case of vertical market applications where the choice of DBMS is not determined by the vendor

Having said that, this architecture found a new lease of life in the Internet age. It can work well in a disconnected environment where the UI is essentially dumb (i.e. a browser). However, in many ways this implementation harks back to the original Mainframe Architecture and indeed, a browser based, two-tier application, can (and usually does) suffer from many of the same issues.

The Three Tier Architecture

In an effort to overcome the limitations of the two-tier architecture outlined above, an additional tier was introduced – creating what is now the standard Three-Tier Client/Server model. The purpose of the additional tier (usually referred to as the "middle" or "rules" tier) is to handle application execution and database management. As with the two-tier model, the tiers can either be implemented on different physical machines (Figure 2), or multiple tiers may be co-hosted on a single machine.

 

Figure 2: Basic Three Tier Architecture

By introducing the middle tier, the limitations of the two-tier architecture are largely removed and the result is a much more flexible, and scalable, system. Since clients now connect only to the application server, not directly to the data server, the load of maintaining connections is removed, as is the requirement to implement application logic within the database. The database can now be relegated to its proper role of managing the storage and retrieval of data, while application logic and processing can be handled in whatever application is most appropriate for the task. The development of operating systems to include such features as connection pooling, queuing and distributed transaction processing has enhanced (and simplified) the development of the middle tier.

Notice that, in this model, the application server does not drive the user interface, nor does it actually handle data requests directly. Instead it allows multiple clients to share business logic, computations, and access to the  data retrieval engine that it exposes. This has the major advantage that the client needs less software and no longer need a direct connection to the database, so there is less security to worry about. Consequently applications are more scalable, and support and installation costs are significantly less for a single server than for maintaining applications directly on a desktop client or even a two-tier design.

There are many variants of the basic three-tier model designed to handle different application requirements. These include distributed transaction processing (where multiple DBMS are updated in a single transaction), message based applications (where applications do not communicate in real-time) and cross-platform interoperability (Object Request Broker or "ORB" applications).

The Multi or n-Tier Architecture

With the growth of internet based applications a common enhancement of the basic three-tier client server model has been the addition of extra tiers, such architecture is referred to as 'n-tier' and typically comprises four tiers (Figure 3) where the Web Server is responsible for handling the connection between client browsers and the application server. The benefit is simply that multiple web servers can connect to a single application server, thereby handling more concurrent users.

 

Figure 3: n-Tier Architecture

Tiers vs Layers

These terms are often (regrettably) used interchangeably. However they really are distinct and have definite meanings. The basic difference is that Tiers are physical, while Layers are logical. In other words a tier can theoretically be deployed independently on a dedicated computer, while a layer is a logical separation within a tier (Figure 4). The typical three-tier model described above normally contains at least seven layers, split across the three tiers.

The key thing to remember about a layered architecture is that requests and responses each flow in one direction only and that layers may never be "skipped". Thus in the model shown in figure 4, the only layer that can address Layer "E" (the Data Access Layer) is Layer "D" (the Rules Layer). Similarly layer "C" (the Application Validation Layer) can only respond to requests from Layer "B" (the Error Handling layer) .

 

Figure 4: Tiers are divided into logical Layers

 

posted by andykr | 3 Comments
Filed Under:

SW Fox is Coming

As I am sure most of my readers in the USA are already aware, this year's SW Fox conference will be held in Phoenix, AZ real soon now. The full details of the conference can be found at www.swfox.net and if you are not already booked to attend, I strongly urge you to take a good look at this conference - and not just because I am speaking there Wink [;)]

Here's what they have to say about the conference this year:

The best Visual FoxPro conference in North America is back! Phoenix, Arizona is the place to be in October 2008 as the top VFP gurus on the planet show you how to make the most of VFP.

Geek Gatherings LLC, a partnership of Rick Schummer, Doug Hennig, and Tamar Granor, are organizing Southwest Fox 2008, to be held at the Arizona Golf Resort and Conference Center in Mesa, Arizona from October 16 - 19, 2008.

Check out the speakers and sessions. We think you'll agree that this is the best lineup of both speakers and content of any VFP conference this year.

Read some of the testimonials from previous attendees. As one attendee said, "The experience gained from just one session can easily pay for the cost of the conference."

Gold sponsor xSQL Software is giving away a free license of xSQL Data Compare Pro (a $399 value -- allows you to compare and synchronize data in two SQL Server databases) to every conference participant. What a great offset of the conference price -- you pay $695 for the conference and get back more than half of it ($399) with this offer alone! Even better, you'll receive your free license as soon as you register for Southwest Fox 2008 so you don't even have to wait until October!

Geek Gatherings is also offering eight MSDN Premium Subscriptions, each worth almost $11,000, as door prizes. There is over $100,000 in giveaways in total.

For more information, please see the Southwest Fox Web site, http://www.swfox.net. We look forward to seeing you there!

On a personal note, I have to say that I am really looking forward to SW Fox this year - for a number of reasons. First it is going to be a great conference - with some of the best and most exciting content I have seen in years. Second it is a great location and a lovely time of year to be there.  Third, I hope to meet up with lots of old friends whom I haven't seen for a year.

I should add that this conference will, in all probability, be my last as a speaker. For more than 10 years now I have been lucky enough to have been  asked to speak at least twice a year at conferences. Most years there have been more than two and, in one year as many as five! To be honest, I am tired.

It takes an enormous amount of work to prepare a conference session (my personal estimate is about 50 hours of preparation per hour of speaking time) but as long as I was enthused and felt I had something to say that really didn't matter much. However, I am running out of things to say, and my enthusiasm for the task has waned too - I have other interests outside of FoxPro and they are taking up more of my time these days.

So to all of you who I have met over the years at various conferences I want to say a big THANK YOU for your support and encouragement.

To all the conference organizers I also want to say THANK YOU. Organizing and running a conference is no easy job and it is often a thankless one. Without your efforts and dedication VFP conferences would never have lasted the course and I really hope that you continue to keep up the good work.

 

posted by andykr | 1 Comments
Filed Under:

What is Data Normalization?

Data normalization is a technique that is applied when designing relational database tables in order to minimize the duplication of information and, by so doing, safeguard the database against logical or structural anomalies that could compromise the integrity of the data. By reducing the amount of data duplication, or 'redundancy' normalization achieves four things:

·         Minimize the amount of space required to hold the database   By removing the need to store the same information in more than one place, the physical size of the data is restricted to a minimum
·         Remove the risk of inconsistencies    Since each data element occurs only once, there can be no inconsistency between 'versions' of the same information
·         Minimize possible Update and Delete anomalies   Since each element occurs only once, there is no need to repeat operations in multiple locations when there is the risk that one may succeed while another fails. Additionally by ensuring that all columns in a table are dependent on the primary key, referential integrity is easily enforced
·         Maximize the stability of the data structure   One thing that is certain about any computer application is that the requirements will change. A stable data structure can handle changes to processing requirements without requiring direct modification of the data structure

The technique defines a series of distinct stages, or "Forms", which represent successively greater degrees of sophistication and that are achieved by the application of ever more rigorous rules. The process begins by reducing the data to First Normal Form (1NF) and then continues to the Fifth Normal Form, however, when using surrogate primary keys there is no need to go beyond the Boyce Codd Variation of Third Normal Form (higher levels only apply when the primary key is composed of more than a single column).

First Normal Form

Reduce tables to first normal form (1NF) by removing repeating or multi-valued fields to another, child table.

The obvious question here is what constitutes "repeating or multi-valued" fields. The data requirements for a "Customer Contact List" might include the following information:

·         Account Number
·         Customer Name
·         Primary Contact Name and Telephone Number
·         Alternate Contact Name and Telephone Number

In a flat-file based system we would create a spreadsheet with these items as columns as illustrated:

Table 1: Flat-File Database

Clearly the "contact" and "phone" fields are actually multi-valued fields. The data in each is identical (a name and phone number) and the only difference is in the way in which they are interpreted (i.e. as either the primary or alternate contact). To reduce this data to first normal form we must, according to the rule, remove these columns into their own table and link it to their owning record with a foreign key. An additional key can be used to handle the "interpretation" by linking each contact record to a lookup table that defines the possible uses for the information as illustrated:

Note: The 'crows foot' on the linking line indicates that there are many records in the table for each occurrence of the related key. Thus for each Customer there can be many records in the Contact table.

Figure 1: The contact list after first pass at reduction to 1NF

The benefit of this structure is that we can now add as many contacts as we wish to our customer – we are no longer limited to just 2. However we are not yet finished.

Look closely at the "Contact Table". Although we have now removed the multi-valued fields into their own table, we still have potentially repeating fields. Each record in this table would require a repetition of the contact's name whenever a contact had more than one associated item of information (e.g. Phone, Cell and Fax numbers). So, in accordance with the rule we should now remove contact detail into a separate table (and we can again create a lookup for the definition of the information). The result is the structure shown in Figure 2

The benefits of this structure are clear. We can now handle an unlimited number of contacts per customer – by simply adding new records to the 'Contact' table and linking them to the appropriate customer record. We can also add new types of contact (by adding records to the Contact Type table). Moreover it is now possible to define an unlimited amount of detail per contact.

Remember that one of the key goals of normalization was to create a stable structure that does not need to change when requirements change. So, if we decided we now need to capture mobile phone numbers, or our contact's birthday to send them a card, we could do so by adding a new detail type record, and adding the appropriate information to a new record in the Contact_Detail table. No structural changes to the database are needed!

Figure 2: The contact list in First Normal Form

Notice also that each element of data appears once, and only once, in only one table (Figure 3). Thus the only data in the Customer table is that which relates directly to the customer entity (in this case it happens to be just a name and an account number). Similarly the only data in the Contact table is that which relates directly to the contact entity - in this case it is just a name.

To reconstruct the original 'flat' data we simply join the tables using their related keys. However the most significant benefit is that we can now also access the data in different ways. For example if we wanted a list of all companies for which we hold a fax number we can do it by selecting only those Customer records which have related contacts with a detail record that includes Type = 2 (defined as a Fax Number).

By separating the data into its component parts in this way we also avoid the possibility of error when updating information because we only ever have to update a single record. So, with reference to Figure 3 below, if Alan Anders (Contact_PK = 1) changes his phone number the only record that has to change is the detail table record where contact_fk = 1 and dettype_fk = 1 (phone). 

The same thing applies when deleting information. If Colin Charles (whose Contact_PK  = 5) were to leave "Canny Castings" we simply  delete his record in the "Contact" table, and any records in the "Contact_Detail" table where the "contact_fk = 5" and any application that displays contact information will immediately reflect the change.

Figure 3: The data from the spreadsheet in relational tables

Second Normal Form

Reduce 1NF tables to second normal form (2NF) by removing fields that are not dependent on the whole primary key.

This deals primarily with cases where a surrogate key is not being used, and the primary key consists of more than a single field. Consider the following partial definition for an invoice table:

·         Invoice Number
·         Line Number
·         Customer ID
·         Product ID

In the absence of a surrogate key, the minimum unique identifier for this table is the combination of Invoice and Line Numbers. Now the question is, do each of the other data elements depend on this key? Clearly the Product ID does - since the product ID is defined by each line on a specific invoice. But the Customer ID is not. It depends only on the invoice number – the line number is irrelevant in the context of the customer.

The solution is, therefore, to remove the Customer ID and Invoice ID to a new table, which is the "Invoice Header" and keep the line item information as the "Invoice Detail", as illustrated in Figure 4, where we have also added a surrogate primary key to the new table to avoid using the Invoice Number directly as the primary key.

Figure 4: 2NF resolution of the invoice data

As illustrated this approach immediately solves another, related, issue. How to differentiate and save the information that relates to the invoice as a whole (e.g. order date, due date, total price, delivery and shipping information) from the information related to individual line items (e.g. product, quantity, price, discount, margin).

Third Normal Form

Reduce 2NF tables to third normal form (#NF) by removing fields, other than candidate keys, that depend on other non-key fields

Essentially this rule is saying that all fields in a table must be dependent on the primary key, and only the primary. This is often referred to as meaning 'the key, the whole key and nothing but the key'. Consider the following partial specification for an ORDERS table:

Table 2: Order Table specification

This table actually conforms to 2NF (the Vendor ID, and the Location of the Vendor are both dependent on the order)  but fails 3NF because the Vendor Location is also dependent on the VendorID and therefore should not be included in this table.

Looking at the invoice header table in Figure 4, we can see that it is, in fact, in 3NF. All of the fields identified (except the foreign key to the customer table) depend directly upon the invoice number – which is itself a candidate key since it must be unique within the table.

Boyce-Codd Normal Form

Reduce 3NF table to Boyce-Codd normal form (BCNF) by ensuring that all tables are in 3NF for any feasible choice of candidate key as the primary key

This variant of the 3NF rule is specifically applicable when using surrogate primary keys. Since the surrogate key is, by definition unique within the table, we need to ensure that the table is in fact in 3NF for any other field that could, potentially, be used as a primary key. The Invoice header table in Figure 4 above contains such a candidate key (the invoice number) and the table, as shown, does conform to BCNF since all columns are also dependent directly on the invoice number.

Fourth and Fifth Normal Forms

There are two additional normal forms in database theory but they are concerned only with cases where the primary key for a table is comprised of multiple columns and so, when using a surrogate key, are not applicable. This, in itself, is another reason for using surrogate keys when designing relational tables.

Summary

Normalization is designed to improve the efficiency with which data is stored in a database by reducing data redundancy and eliminating the need for complex mechanisms to manage changes to data. A properly normalized database will be fast and easy to manage and maintain, while a poorly normalized structure will be sluggish and difficult to manage and maintain.

 

posted by andykr | 1 Comments
Filed Under:

What is a Key?

The word "key" has a very specific meaning in relational database design. It refers to a field whose value is shared between tables. The purpose of such data duplication is to enable a record in one table to be associated with a record in another, thereby allowing the extraction of information that is distributed between the tables. Such an association is referred to as a 'relationship' and is what differentiates relational databases from flat-file databases.

There are two types of relational key:
·         PRIMARY A primary key is the column, or combination of columns, whose value uniquely identifies an individual record within a table. Neither the primary key, nor any part of it, may have a value of NULL
·         FOREIGN A foreign key is the column, or combination of columns in one table whose value identifies an individual (i.e. related) record in another table. A foreign key may, under some situations, have a value of NULL indicating that there is no related record

When defining a relationship between two tables, the table containing the primary key is the "Parent" (also known as the "Referenced Table") and the table containing the foreign key is the "Child" (also known as the "Referencing Table"). Thus the foreign key references the primary and thereby defines how the records in the tables are linked.

It is a fundamental rule that all tables must have a primary key. This is obvious if you consider a table that contains the real names of my own immediate family as listed below.

First Name  Last Name
Stanley     Kramek
Elaine      Kramek
Andrew      Kramek
Richard     Kramek
Angela      Kramek

This is fine and shows my father (Stanley) and mother (Elaine) together with my younger brother (Richard) and sister (Angela). This table's primary key is the combination of first and last name - and that is perfectly allowable. However it is not a good design because when my brother married a girl named "Angela" (and yes, he really did) we encounter a problem when we try to add my new sister-in-law to this table:

First Name  Last Name
Stanley     Kramek
Elaine      Kramek
Andrew      Kramek
Richard     Kramek
Angela      Kramek
Angela      Kramek

Now we have no way to differentiate between my sister, and my sister-in-law! This table has no longer has a primary key and in the absence of any additional information is effectively useless. Of course, the obvious solution is to add some additional, differentiating, information perhaps 'middle initial' – although that is not a particularly good choice since not everyone has one, and the value might still not be unique, and as it happens it isn't (my brother's middle name is "Maurice", my sister's is "Mary" and my sister-in-law happens to be "Margaret").

First Name  Last Name   Middle Initial
Stanley     Kramek
Elaine      Kramek
Andrew      Kramek         E
Richard     Kramek         M
Angela      Kramek         M
Angela      Kramek         M

The point of this example is that it is critically important to define a primary key for each table so that every record can be uniquely and unambiguously identified. So far we have defined the primary key as a combination of columns that are used to store the data. This is referred to as a "compound" key.

The first problem with compound keys (as this example shows) is that it is perfectly possible to end up with duplicate values in several columns and so such keys tend to get ever-more complex as data is added to the database.

That raises a second problem for compound keys because any table that references a table that has a compound primary key must carry all of the elements of the key. So in order to link any other information to my 'family' table I would need to duplicate all the information in that table - which makes the task of updating information horrendously difficult.

A much better option would be to use some unique value associated with each individual that can be stored in a single column and, fortunately we have just such a thing available to us – our Social Security Number. SSNs can be stored in a single column and have a value that is unique to the person. This will, finally, allow us to identify each record in the table unambiguously. It is, therefore, a perfectly natural choice for the primary key in any table holding information about people.

In fact Social Security Number is an example of what is called a "Natural" Key (also known as either an "Intelligent" or "Business" Key) and, in this example it will work very nicely giving us a final, usable, version of the table as shown.

First Name  Last Name   Middle Initial   SSN
Stanley     Kramek                       814-56-8975
Elaine      Kramek                       823-22-4578
Andrew      Kramek         E             866-41-8537
Richard     Kramek         M             866-77-5411
Angela      Kramek         M             866-23-8544
Angela      Kramek         M             822-39-6425

But hang on! Are Social Security Numbers truly unique? The answer is 'no'. In fact there are rules that define what constitutes a valid SSN and consequently there is a finite limit on the actual number of truly unique SSNs that can exist – additionally there are certainly cases on record of duplicate SSNs being issued.

This highlights the first of several problems associated with the use of natural keys - are they truly unique? In fact, in order to be usable as a primary key, a candidate must possess three attributes:

·         Uniqueness:      The value must be unique within the table

·         Consistency:      The value must be capable of being validated in terms of the data type to which it is being assigned. In other words, the specified value must be consistent with the rules for the database.  For example, we can be sure that 35/45/2009 is not a valid date, but it is impossible to determine whether "123456789" is an SSN that is missing the dashes that format it, or some other value altogether

·         Verifiabity:        The value of a natural key refers to some real entity and so we have to be sure that the entity actually exists. For example, we know that an SSN is invalid if the first three digits are higher than '770', but we cannot determine whether "123-45-6789" is a 'real' SSN or not

However these are not the biggest issues associated with the use of natural keys as primary keys. The real problem with them is that they are governed by rules that have nothing to do with their use as a primary key, and so are subject to change. Remember, the definition of a primary key is that it must be unique within a table and it may not be null. However, the rules governing SSNs, Vehicle Identification Numbers, Account Numbers, Invoice or Order Numbers, or any of the dozens of other obvious 'natural' keys have nothing whatever to do with 'being unique within a table' which is the fundamental requirement for a primary key.

Does that matter? Emphatically yes! Since the values are governed by rules that relate to their principal function they are subject to restrictions that are irrelevant in the context of a primary key.  For example, a very common requirement for Invoice Numbers is that they should be issued in an unbroken sequence. If an invoice number is also used as a primary key, the system has to ensure that users do not allocate an invoice number that conflicts with that rule, and prevent users from assigning a number and then abandoning the invoice – even though it makes absolutely no difference whether the primary keys form an unbroken sequence or not (so long as they remain unique within the table). Managing the unbroken sequence requirement for a primary key introduces unnecessary complexity (and hence potential for error) into the system.

Change is, of course, the other issue. Since Natural Keys reflect real entities they are liable to change. Primary keys are also stored (as a foreign key) in all related tables, so it is imperative that any changes to the actual value of a primary key must be reflected in all related records. This can have major impacts on a database since it will require that all indexes involving the key be re-built, and may even force physical re-ordering of data within the database. Now, you may be wondering, at this point, why a natural key used as a primary key would ever change. The commonest reason is, of course, 'human error'. Which leads us to the final issue with Natural keys - that of how they are captured in the system.

Natural keys cannot, by definition, be generated by the database. They are also, almost invariably, examples of data that cannot even be derived and so must be input directly by some user (SSN is a good example of this principle; given that the last SSN entered into the system was "876-11-1655" what is the next one going to be? There is, of course, no way to know!).

The issue is that there is a limit to the amount of validation and verification that can be performed and so it is perfectly possible for errors to be entered into the system. It is axiomatic that in the context of data there is no software solution to "Valid, but Wrong". In other words no matter how much validation and checking you do, you cannot detect that the SSN "123-45-6789" should really have been entered as "123-45-6798"!

Of course, it is likely that these errors will, eventually be caught. Then they have to be corrected with all the problems that this raises. In fact, in applications based on natural keys, the routines for handling changes and updates to key values are usually the most complex and difficult code in the entire system. So what is the alternative? Enter the "Surrogate Key"

A surrogate key is simply a column added to the table whose only function is to be the primary key for that table. Typically these columns are simple integers and all major databases provide for the auto-generation of the values. In SQL Server these columns are called "Identity" columns, in VFP "Auto-Incrementing Fields" and other databases use other names but the principles are the same.

All of the issues outlined above in respect of Compound keys, and Natural keys, disappear when a surrogate primary key is introduced. As records are added to a table, the database itself generates the next value for that table and assigns it to the new record. There are no issues because the actual value has no significance whatever – it merely has to be unique within the table and the database itself ensures that this is done properly.

This also means that surrogate keys never need to change. Since the actual value does not matter there is never any reason to change it. Furthermore, because the value is generated as an integer, it requires very little space (4 bytes) and is easily indexed and accessed, making the establishment and maintenance of relationships much more efficient.

In the context of normalization surrogate keys have the added benefit of not requiring analysis any deeper than the Boyce-Codd Normal extension of third normal form. (This is because all further normalization rules apply only to cases where the primary key consists of more than a single column).

In short, there are really no reasons NOT to use a surrogate primary key on all tables in a relational database. In those special cases where data must be generated at different locations and later merged, or when keys must be generated by the client rather than by the server, GUIDs provide a better choice than simple integers as primary keys. But changing the data type does not affect the principle that surrogate keys are the best way to define primary keys.

posted by andykr | 10 Comments
Filed Under:

Access and Assign methods can be useful!

Access and Assign methods were introduced into VFP back inversion 6.0, but for many developers they are still very much an "unknown feature". My objective today is to show a couple of examples of where Access and Assign methods can be very useful. Let's start by reviewing just what Access and Assign methods are, and how to create them.

Basically these methods are event-driven in the sense that, if one is defined for a property, it is fired whenever the associated event occurs. So for an assign method, the event is (not surprisingly) the assignment of a value to the property. Note that it doesn't matter how that assignment is made (i.e. whether you use the STORE command, or simply an "=") the method is fired. Similarly for an Access method the triggering event is any reference to the property. Again the actual method does not matter; the method fires whether the property's value is being read into a variable, or used in an object reference or even being output using the "?".

You can add Access and Assign methods to any custom property that you create, and to most of the VFP native properties through the Edit Property/Method dialog. Exceptions are the Value property of a control and native properties of ActiveX controls (though you can add them to the properties of a VFP OLE Control that hosts an ActiveX control).

To create an access or assign method for a property all you need to do is define a method that is named for the property plus the suffix "_Access" or "_Assign". This is handled automatically in the visual designers in both the "New Property" and "Edit Property/Method" dialogs where you simply check the appropriate box to create the method. In code you simply declare the method in the class definition like any other method. Note that an Assign method MUST define an input parameter to receive the incoming value, while an Access method has no parameters at all.

DEFINE CLASS xObj AS SESSION
  *** Define a new property
  iLastID = 0
  *** Create an Assign Method for the new property
  PROTECTED FUNCTION iLastID_Assign ( tuInval )
    *** Only allows values >= 0
    IF VARTYPE( tuInval ) = "N" AND tuInval >= 0
      This.iLastid = tuInval
    ENDIF
  ENDFUNC
  *** Create an Access Method for the native DataSessionID property
  PROTECTED FUNCTION DataSessionID_Access
    *** Returns the DataSession NOT the datasessionid!
    RETURN This.DataSession
  ENDFUNC
ENDDEFINE

In this simple definition I have defined a custom property that will only ever accept numeric values that are greater than or equal to zero. Trying to set this property to anything else is simply ignored (we could, of course, return an error).  In the case of the DataSessionID property, the access method means that trying to read the current value of the property will return, instead, the current datasession number. The DataSessionID property is, therefore, effectively hidden.

Creating Strongly Typed Properties

One really practical use for an assign method is to create strongly typed properties so that errors cannot occur in your code because a property has an inappropriate value. This can effectively reduce the need to check values repeatedly in code by handling the checking at the point at which a value is assigned. If it fails the test the property rejects the value. The basic methodology is illustrated above, but the code can, and usually is, much more rigorous than indicated there.

One scenario where I use an assign method like this is when storing a record number, or ID value. Obviously I do not want an invalid number in this case so I use an Assign method that forces the value to be an integer in the appropriate range. Here is the code from a property used to store the current user ID value:

PROTECTED FUNCTION iUserID_Assign( tnValue )
  IF VARTYPE( tuInval ) <> 'N' OR EMPTY( tnValue )
    *** Not a number, ignore it!
    ASSERT .F. MESSAGE "User ID must be passed as an integer"
  ELSE
    IF NOT INT( tnValue ) == tnValue
       *** Not an integer, ignore it
       ASSERT .F. MESSAGE "User ID must be passed as an integer"
    ELSE
      IF tnValue < 0
         *** Not 0 or higher, ignore it
         ASSERT .F. MESSAGE "User ID must be passed as positive integer"
      ELSE
        *** We may allow this one – check it
        SELECT userid FROM usertable WHERE userid = tnValue TO SCREEN NOCONSOLE
        IF _TALLY = 1
          *** This is a valid ID
          This.iUserID = tnValue
        ELSE
          *** Ignore it
          ASSERT .F. MESSAGE "User ID must be passed as positive integer"
        ENDIF
      ENDIF
    ENDIF
  ENDIF
ENDFUNC

Notice the use of the "TO SCREEN NOCONSOLE" in the SQL query – this is an old trick that suppresses the output from a query and is useful when we want is to know if a record exists, or how many records meet the criterion – effectively this is executing a SQL Statement of a type that is not normally permitted in VFP:
 _TALLY = SELECT COUNT(*) FROM [source_table] WHERE [condition]

Running Code with SETALL()

Another use for an assign method is to enable code to be run on multiple objects using SetAll(). This method, which exists on all VFP Container classes, is used to set the same property to some specific value on all objects in scope that have that property. If an object does not have the property in question, the instruction is ignored in the context of the object and does not cause an error.

There are some scenarios is which it would be very useful to be able to run some specific code on all objects on a form, or in some container, but only if they actually have the specified method. But SetAll() only applies to properties, not methods, and so to do this we would need to loop through the objects collection and use PEMSTATUS() to determine if each object had the relevant method. If so we would call it and then proceed with the next object.

A much easier option is to simply create a property with an assign method on the object that calls the required method from inside the Assign. After all, there is nothing in the implementation rules that say that an assign method must actually assign a value! So at any time a value is assigned to the property the appropriate method is called. Since we can use SetAll() to set properties we avoid the necessity to loop through the collection and test objects. The code is terribly simple:

PROTECTED FUNCTION iRunCode_Assign( tnValue )
  This.UpdateSelf()
ENDFUNC

So in the form we can simply have:

ThisForm.SetAll( 'iRunCode', .T. )

So any object that has the 'iRunCode' property will immediately call its own 'UpdateSelf()'' or whatever other method is required. You can even create this property and associated assign method as a 'template' method. In other words create the property on your root class but leave the assign method empty. In a specific instance you can then call whatever method you want by simply supplying the necessary code in the Assign method.

The only caveat to this is that you cannot use this approach when sequential execution is needed because there is no way to control, when using SetAll() the order in which objects will execute their code. However, you could still set the property explicitly when you need objects to run their code in a specific order although in that case you might as well call the methods directly.

An additional benefit of using Assign methods to execute code is that the code being called can be hidden from external objects, or even from sub classes, because the property in question is defined at the same level as the method itself and so can execute that code even when the method cannot be called directly.

Creating Objects inside an Access Method

An access method fires whenever its property is accessed so if that property is used to hold an object reference we can use the Access method to determine whether the object exists. If the object is there, we simply return the reference otherwise we can try and create the object on the fly and, if successful, again return the reference. Why bother?

Well it avoids the necessity of testing an object reference property to ensure that the object is there. Here is the code from the access method in a form class that uses our Data Manager object.

PROTECTED FUNCTION oDM_Access
IF VARTYPE( This.oDM ) = "O"
  RETURN This.oDM
ELSE
  IF PEMSTATUS( _Screen, 'oDM', 5 ) AND VARTYPE( _Screen.oDM ) = 'O'
    *** There is an object reference out there already, grab it
    This.oDm = _Screen.oDM
  ELSE
    *** Add the property (no error if already there...)
    ADDPROPERTY( _Screen, 'oDM', NULL )
    *** And instantiate the Data Manager
    SET PROCEDURE TO dataclass, dsetbase, datamgr ADDITIVE
    _Screen.oDM = CREATEOBJECT( 'xDatMgr', This.cDSNToUse )
    IF VARTYPE( _Screen.oDM ) = "O"
      This.oDM = _Screen.oDM
    ELSE
      This.oDM = NULL
    ENDIF
  ENDIF
ENDIF
RETURN This.oDM

Our general practice when working is to use a property on the VFP _Screen object to hold the object reference to the Data Manager object. So this code first checks to see if the object exists – if it does it simply grabs the reference and populates the property with it. If there is no Data Manager, this code adds the property (even if the property already exists this will not cause an error). It then instantiates the data manager object and initializes it by using its own connection information (the cDSNToUSe property holds this). If the creation succeeds the object reference is returned otherwise a NULL is returned.

By encapsulating the process of instantiating the object like this, we can include it in the root class of our form and forget about it. Any form based on this class will now either pick up the existing object reference, or create a new one for all subsequent forms to use, without the need of any specific code in the application start-up.

Returning Other Values when accessing a property

Another use for Access methods is to return something other than the actual value held in a property.  One scenario where you may want to do this is when a property is used to store an ID value but what you actually need is the look-up of that value. Obviously this can be done in code, but if you frequently need it then it makes sense to handle it in an access method like this:

PROTECTED FUNCTION nKeyCode_Access
LOCAL lcRetVal
  IF SEEK( This.Value, 'KeyTable', 'KeyCode' )
    lcRetVal = KeyTable.KeyDesc
  ELSE
    lcRetVal = ''
  ENDIF
  RETURN lcRetVal
ENDFUNC

By extension the same methodology could be used to return an object with multiple values – for example an entire record from a table created using SCATTER NAME. Code similar to that above would simply return a data object, or a NULL:

PROTECTED FUNCTION cDataKey_Access
LOCAL loRetVal
  IF SEEK( This.Value, 'KeyTable', 'KeyCode' )
    lnSel = SELECT()
    SELECT keytable
    SCATTER NAME loRetVal
    SELECT (lnSel)
  ELSE
    loRetVal = NULL
  ENDIF
  RETURN loRetVal
ENDFUNC

There are many more cases where Access and Assign methods can be used to simplify your code and I would be interested to hear of any other uses that you have found for these methods.

posted by andykr | 2 Comments
Filed Under:

FoxRockX – Time to Put up, or Shut up

Over the years much has been made about the FoxPro Community and how important it is to the life and survival of the product. However, as I have said many times (most recently in my "So, no more Visual FoxPro. Now what?" blog entry from May of 2007), it has often been noticeable that the "Community" has been long on talk and short on reaching for their wallets. 

Of course, in the past year Microsoft has announced that VFP will not get another version and that future work, if any, on the product will be limited to critical fixes only. Add to that, the fact that participation at conferences has fallen to the point where from having one major international conference and two or three regional conferences per year in the USA there is now just one regional conference dedicated to FoxPro left – SW Fox had over 150 attendees last year (out of how many thousand FoxPro users in the USA alone - where were the rest of them?) See http://www.swfox.net for details of the 2008 conference... 
Subscription to magazines has dropped off to the point that the two major monthly publications dedicated to FoxPro both essentially died on their feet. Participation in the technical on-line forums has dropped significantly over the past 10-12 months.

Into this picture, at this moment in time, steps Rainer Becker.  

Who is Rainer Becker? 
Rainer Becker is the leader of the German Foxpro user group DFPUG since 1993, publishes the small magazine Foxx Professional, runs an interactive Visual FoxPro forum at forum.dfpug.de, a Visual FoxPro document portal at portal.dfpug.de, a Visual FoxPro eNewsletter at newsletter.dfpug.de, an international online shop at shop.dfpug.com and has organized the German Devcon since 1994. He has been a speaker at various Microsoft events as the Dev Days, Fox Teach and Ce BIT - at Ce BIT fair he was at the VFP demo machine 1993-2005. In his spare time he undertakes consulting work for "Wizards Builders" and for "ISYS GmbH" as well as publishing the Visual FoxPro framework "Visual Extend".
A
dditionally he has been a Microsoft MVP for his work in promoting and supporting Visual FoxPro in Europe since 1997 and in 2007 was a recipient of the FoxPro Community Lifetime Achievement Award.

What has he done? 
First he has undertaken the enormous (and very risky) task of  starting a completely new bi-monthly magazine dedicated to Visual FoxPro – which will be available both on-line and in printed version. That is FoxRockX.
Second he has bought up the entire FoxTalk archive and is making it available, on-line to subscribers to the new magazine, at no additional charge!

You can find full details of FoxRockX here: http://www.foxrockx.com/seite.htm
And you can subscribe by going to either:
http://www.hentzenwerke.com (USA/Asia)
http://shopdfpug.com (Europe)

Of course the most important question for many people is, how much will it cost?
O
n-Line Subscription (including access to the archives)     US$ 99.00      ( 75 EUR )
Printed Subscription (includes on-line and archive)           US$ 158.00    ( 109 EUR )

(Note if you are an existing FoxTalk subscriber you can upgrade your existing on-line subscription to the printed version by paying only the difference i.e. US$ 59.00 (34 EUR).