Thursday, July 17, 2008

Legacy Connectivity in a 64 bit world, Part 5 of 5

And now, the final boot in the teeth...

As I detailed in my last post, it was decided to segregate access to Visual FoxPro data by running OPENQUERY statements on a 32 bit SQL Server 2005 instance. I did all of my initial tests on my laptop, using a local SQL Server instance to access the FoxPro files stored on network file shares. Before I deployed this to a server, my initial thinking was that this wouldn't be too bad once we put support for delegation into place.

Aside: For the uninitiated, delegation allows a SQL Server to use the Active Directory credentials (AKA "domain logins") of the currently-connected user to perform tasks. Essentially, you need to create an Active Directory object called a Service Principal Name or "SPN". This is not hard to set up. For me, the bigger chore was getting over the heebie-jeebies caused by the idea of having a service impersonate users.

In our case, one server would be doing remote queries to another SQL Server. By putting this into place, we could keep our linked server configurations simple, without lots of "mapping" of "user on localserver" to "user on remoteserver". This also works for SQL Server credentials (AKA "standard logins" or "sql logins"), as long as the usernames and passwords match on the local and the remote servers. We would just need to create logins and users on the 32 bit server, as we normally would. This would help to document our security situation and should make adding and removing users easy for the DBAs since we use Active Directory groups for everything. The network guys do the work of adding users to the AD groups, so the DBAs have effectively outsourced that function to the network security people.)

Here's the thing: The Visual Fox Pro OLEDB drivers do not work if you are querying them remotely.

And by "remotely", I mean anything that isn't running on the server itself. If I use SSMS from my workstation and try to use AD credentials, the query fails with the message:
OLE DB provider "VFPOLEDB" for linked server "OLEDBFOXPRO" returned message "Invalid path or file name.".

If I reconnect using SQL credentials, the query works. If I use NTLM authentication rather than Kerberos authentication, both of which authentiate my AD credentails, the query fails.

The situation is the same if I set up a linked server and query it instead of going directly to the Fox Pro server. This makes sense because a linked server will just go and query the Fox Pro server on my behalf. If I can't directly run a query with SSMS, the linked server won't be able to run that query either.

If I RDP onto the 32 bit server and use SSMS to run a query, it works fine. It doesn't matter if I either AD or SQL credentials; they both work.

This is very repeatable and fails even if I specify a full path to the file.

This doesn't seem to be affected by what credentials the SQL Server services run under.
If I copy the FoxPro files to the 32 bit server, it also works if I connect with AD credentials. Unfortunately, we need these files on the network file shares.

To sum up: The failure is specific to what kind of credentials I connect with. I believe that this is a bug in the OLEDB drivers.

So, we will have to set up a certain amount of mappings of user credentials in our linked servers. By luck, many of the applications use SQL Server credentials rather than Active Directory credentials, and those seem to work.

Tuesday, July 15, 2008

Legacy Connectivity in a 64 bit world, part 4 of 5

This post continues my five part saga on connecting to legacy data formats. This third data source type (of four types) is Visual FoxPro. I haven't seen very much Fox since 1992, when I was a C programmer and worked with dBase, FoxPro and Clipper for a few months. I thought that everyone left FoxPro and dBase behind in the mid-1990's. Not so.

Visual FoxPro
was dead-ended by Microsoft a while back. Microsoft states that the ODBC drivers are depricated and that the 32 bit OLEDB drivers should be used. I can pretty reliably bring down an instance with the ODBC drivers and I am very interested in using something besides these drivers.

There are 32 bit OLEDB drivers freely available from Microsoft but it looks like there won't ever be any 64 bit drivers.

All of the projects which I found on Sourceforge looked too shaky to consider running "for real", and I'm not in a position to start writing or supporting open-source OLEDB drivers. No third-party has stepped up to the plate, and I think that someone is missing a business opportunity here. I know that xBase is dead but given the amount of tools that used to work with FoxPro, dBase and Clipper files, it is odd to me that no one has ported their stuff to 64 bit and produced an OLEDB driver.

The fourth data source type is Excel and Access files. This is even more flabbergasting than the FoxPro situation. For the uninitiated, Excel and Access support is through the "Jet" drivers, which can also do some FoxPro work. The problem for us is that there does not seem to be a 64 bit version of Jet. The replacement for Jet is called "Ace" and is oriented towards the Office 2007 file formats. That doesn't matter because there is no 64 bit build of Ace, either. I'm sure that there will eventually be a 64 bit build of Ace, but we can't wait until the next version of Office ships to get it.

As a workaround, I could potentially import the data to native SQL tables using DTS and/or SSIS using the 32 bit providers if I force the package to run on the server in 32 bit mode. The issue there is that there is heavy use of OPENQUERY at random times during the day to these external data sources. The data sources are not updated on a schedule, so I can't just schedule a package to load the data once every so often. I can't alter the clients, so having them run some sort of code with sp_start_job, xp_cmdshell, the OLE/COM stored procedures or CLR before looking for their data isn't practical. (No to mention that implementing a solution with sp_start_job would complicate our security issues, that we are trying very hard to avoid enabling the xp_cmdshell or COM stored procedures or that running packages in a CLR procedure seems to be very frowned on by Microsoft.)

The solution that we are working towards is to keep a 32 bit instance around for the explicit task of using the 32 bit drivers. In other words, SQLServer/64 talks to SQLServer/32 which does the dirty work and hands the results back to SQLServer/64. This is suggested on various googlable forums around the internet. Our initial testing shows that this is not any worse than directly access files. Most of our effort here will be due to increased administration requirements and security configuration.

Thursday, July 10, 2008

Legacy Connectivity in a 64 bit world, Part 3 of 5

The first hint of real trouble came when we started looking at the existing DB/2 connectivity.

The second data source type (of four types) is DB/2. Currently, the application uses 32 bit ODBC drivers dating from the late 1990s. They were written by Wall Data, which was acquired by NetManage in 1999. NetManage's web site doesn't seem to acknowledge that these drivers existed or that NetManage supplies any sort of database connectivity software. One hint as to the age of the drivers: The drivers refer to AS/400, which was renamed to "iSeries" about, oh, forever ago.

It turns out that there are a number of 64 bit OLEDB drivers for DB/2 available, including Microsoft's own DB/2 driver, which used to be part of it's "Host Integration Server 2004" product, but is now freely downloadable. IBM's got drivers too.

My initial problem wasn't finding drivers, it was getting the linked server to talk to the DB2 instance. No one involved understood what to supply as a connection string. I found several resources on appropriate connection strings via google. When I configured the linked server with my best guesses, all I got was "invalid parameter" errors, which did not provide any sort of hints as to what the actual problem was.

Eventually, I found that the current IBM DB/2 driver would not talk to the DB/2 instance in question because the instance was so old (the version dates to the mid-1990's) and modern IBM drivers can't handle anything that old. (There was some sort of seismic shift in the communications layers during the intervening years and the new drivers just don't "habla the espanol", so to speak.)

The vendor that controls the DB/2 is not going to upgrade for us; we are only one of their customers. So, the next thing on my hit-list was the Microsoft drivers.

Eventually, I found the correct way to get the Microsoft drivers to connect to the DB/2 instance, which is to use Microsoft's setup program (the "Data Access Tool") and not to try to construct the strings manually. Unfortunately, the drivers won't do anything but connect because Microsoft's drivers look for certain packages to be installed on the DB/2 instance. The instance in question did not have those packages, so I could not even issue SELECT statements. The vendor that controls the DB/2 instance will not install these package. Due to the age of the instance, I'm not sure if these packages would operate properly If I were in their position, I wouldn't install them either.

We don't get very much data from this vendor, and they actually prefer to FTP the data via a secure channel. So, the actual solution will be to move from remote queries to loading text files which are FTP'ed to us.

Tuesday, July 8, 2008

Legacy Connectivity in a 64 bit world, Part 2 of 5

To continue from the last posting, a project that I am associated with needs to deal with four non-SQL Server data sources. I'll start with the easy one first...

The first data source type is text files. We can get the data in these files into SQL Server with BULK INSERT (which seems deprecated in SQL Server 2005, but seems to still be in SQL Server 2008) or OPENQUERY using the bulk option. This is all native to SQL Server, so I'm not counting this as a major issue.

Exporting to text is a problem, but there are only a few very simple examples of that in the codebase. We are trying to avoid use of xp_cmdshell for all of the usual reasons. For most of our issues, we can probably use SSIS or even a custom vbscript in a job. The worst-case scenario is a small CLR procedure.

(Why-o-why doesn't SQL Server have a BULK EXPORT?)

I'd prefer to keep these OPENQUERY statements in place because they are mostly in views and procedures stored in user databases. It's much easier to manage database objects than it is to manage objects like DTS or SSIS packages. Packages are stored at a server level rather than a database level. In other words, if I backup a database and give it to someone else, I have to remember to export the packages and give those as well. The other thing is that it's not feasible to run a package when a client program runs a select statement and return some sort of result set from the package. Not all of our code is nicely bound up into stored procedures (which is hardly uncommon), so there is no way to 'get in there" with some sort of shim and run something before a client expects results.

The data amounts in play here are small, usually no more than a few thousands of rows, so performance should never be a problem. So, to sum up, this isn't that much of a problem. The first real problem will be detailed in the next posting.

Thursday, July 3, 2008

Legacy Connectivity in a 64 bit world, Part 1 of 5

One of my projects involves moving some very old applications running against a 32 bit SQL Server 2000 instance to a 64 bit SQL Server 2005 instance. When I the project was initiated, I thought my largest problems would be:
  • Differences in behavior between 2000 and 2005
  • Porting a couple of hundred DTS packages to SSIS

Wrong. It turns out that my biggest problem is a lack of 64 bit OLEDB drivers.

I have had reason to spend a good deal of time researching the availability of 64 bit solutions for certain legacy data formats. This is a summary of that research and I hope that it saves someone some pain. Anyone could do what I have done here by spending time searching with google. I don't know anything unique, nor do I have any insider knowledge. Still, I was surprised by the situation on the ground.

An early version of this originally appeared on Swynk email lists, specifically the sql2k5 list. It's been edited to tighten up some of my informal writing style. I have also added some new details and information.

(FWIW, I've been reading and occasionally sometimes posting to the Swynk lists since some time during the Clinton administration. I gave up reading usenet postings years ago and I find "web" forums too clunky to use on a regular basis). I think that the only thing that I've been reading longer than swynk is slashdot.)


To begin: In a 64 bit SQL Server environment, you must have 64 bit drivers for OPENQUERY to use. In other words, a 64 bit SQL Server won't run 32 bit OLEDB drivers.

One can get around this problem if the codebase is in DTS. In that scenario, it is possible to force the execution of DTS packages in a 32 bit environment on a 64 bit Windows OS. Another possibility would be to run the packages on a dedicated 32 bit SQL Server. This has performance and security considerations. (Generally, I like to run packages on the destination server.)

Unfortunately, our code base has a lot of OPENQUERY usage. OPENQUERY is used in stored procedures that are executed by user actions throughout a day, as well as by scheduled jobs. Theoretically, all of the OPENQUERY could be rewritten as DTS or SSIS packages and then run by xp_commandshell or via on-demand jobs. I'm leery of the performance and security implications of that, plus that would be an overwhelming amount of work.

So, what to do? Details in the following posts...

Tuesday, July 1, 2008

Powershell functions: whoami and Prompt

Two quick ones for this post, before the US 4th of July holiday. Both functions have one thing in common: use of environment variables. In order to see the environment variables that are available, just type:
dir env:

(This listing could be extensive. (dir env:).Count returns 51 on my laptop.)

Environment variables aren't as popular as they used to be, with settings located in registry entries, .ini files and .xml files. For a few things, it's still easier to find their values through an environment variable than to figure out where they hide in the registry or google for the proper .net calls. When I set up a new machine, I add a path to the environment that points to my local copy of my scripts. This can be very handy since I still have some .vbs and .wsf scripts lurking around.

whoami simply displays the name of the currently logged in user. It's very simple, but it is also a good, simple example of how to use environment variables in a script.

Here is the function:
function whoami {
$env:userdomain + "\" + $env:username
}

I often use $env:userdomain as a default for parameters of functions that look up information about domain users and groups. With most of my clients, the chances are that the user or group that I am interested in will be in my domain.

Prompt functions are another one of those things that everyone writes. Some are sparse, some are colorful and some display so much information that they can be confusing to anyone but the original author. Mine is fairly sparse:

function prompt {
$host.ui.RawUI.WindowTitle = "psh"
$promptText = "[" + $env:username + "@" + $env:COMPUTERNAME + " @" + $(get-date -uformat "%H:%M:%S")
write-host $promptText -NoNewLine
write-host " $(get-location)" -NoNewLine -ForeGroundColor Cyan
write-host "]"
}

I'm on the fence about whether the two-live format is a good idea or not. I include the time as a sort of poor-mans stop watch. It lets me get an idea for a when a command stopped without actually timing it and without getting too fancy in the prompt code.

That's all for this posting.