Thursday, December 10, 2009

Automating Excel copy and paste jobs across multiple sheets

When I first started this project, I found myself with this Excel workbook that had a few sheets in it, each sheet containing an extract layout for data of a given record type. For example, if the record type was "Person" the extract layout might look like this:

FIRST_NAME
LAST_NAME
PHONE_NUMBER

etc.

Those things in capital letters are called fields, or headers. In my example above, there are 3 fields/headers. In reality, I currently have a record type that has 118 fields. The problem, as you'll soon see, is that these headers are shown in a top-to-bottom format, instead of left-to-right. My job is to get these guys showing in a left-to-right format. (That's easy; there's a Paste Transpose feature in Excel that does this for me--but it doesn't delete all the other crap in the spreadsheet that I don't need.)

I work with a program called PDI (Pervasive Data Integrator) that doesn't much care for the way these fields are presented in my handy dandy Excel spreadsheet. Instead, it prefers to have those headers across the top of the very first row, like this:

FIRST_NAME LAST_NAME PHONE_NUMBER

Over time, this small Excel spreadsheet grew from having 3 sheets (record layouts) in the workbook to 13 at the time of this writing.

My goal was to automate the process of moving those fields from a top-to-bottom format to a left-to-right format while simultaneously deleting all the extra junk that appeared to the right of each field in the spreadsheet, like the length of the field, the field's data type, which database table it came from, etc.

So basically my goal was to turn this:

FIRST_NAME CHAR 12 PERSON 5
LAST__NAME CHAR 17 PERSON 5
PHONE_NUMB CHAR 10 PERSON 5

...into this:

FIRST_NAME LAST_NAME PHONE_NUMB

See all that extra junk I need to get rid of? At first, I recorded a macro in Excel, but I found that this was simply not acceptable for a number of reasons. The main reason is that it wasn't dynamic. It always selected a certain number of fields, and did whatever it was supposed to do. So, if I had only 10 fields I wanted to do my wizardry on and I used my original code, it would have selected 150 fields (from A2:A151 in Excel terms) and when I then imported those fields into my PDI map, PDI would recognize all 150 fields instead of just the 10. Making matters worse, PDI doesn't let me simply delete all of those extra fields at once (it's buggy like that) so I can only do groups at a time.

Basically, it's a waste of time.

So, I then took that code and decided to make it generic and adaptable. But then there was another problem. "Great, I can get it to automate the process on one sheet, but what about the other 12?" Imagine if there were a hundred sheets. Scale it to a thousand. You get the idea. Time-consuming to do, unless you automate that process.

How did I do it? I did it by using a loop within a loop. The first loop is a For Loop. It says, "For sheet 1 through the very last sheet, do this stuff."

The second loop is a Do While Loop. This loop says "while the currently selected cell is not empty, do this stuff." All it does is examine the cell it's on to see if it's blank or not. If it's not blank, it moves down to the next cell and evaluates it in turn. It does this until it encounters a blank cell, indicating that there is no more data I need to worry about. The key here is that it keeps track of the cell number that it's on.

So, I now have a variable keeping track of my starting and ending cells (myRange1 and myRange2, respectively). That means I can store those guys into a final variable called finalRange, which will be used to tell Excel which range of cells to select.

Cool, huh?

The code below will look like crap on this blog since it's in such a narrow column, but copy and paste it into your Excel workbook's VBA window and it'll work.

Here's the code:

Sub PrepForPDI()
'
' PrepForPDI Macro
'
' Keyboard Shortcut: Ctrl+m
'

'Set up the variables we'll be using
Dim myCounter, endCounter, mySheetCount, i As Integer
Dim myRange1, myRange2, finalRange

'Step 1. Starting at the first sheet, do the following. Repeat for each one until the very last sheet.
For i = 1 To Sheets.Count
Sheets(i).Select 'Select sheet i (starting with 1 since i = 1 above)
myRange2 = "A2" 'This is our starting range
myCounter = 2 'Counter that keeps track of currently selected cell

'While the selected cell is not blank, increase my counter and update my ending range
Do While Range(myRange2).Text <> ""
myCounter = myCounter + 1
myRange2 = "A" & myCounter 'The result of this variable would be A1, A2, A3, depending on where my counter is at
Loop 'go back up to the Do While statement and repeat until a blank cell is encountered

'Once a blank cell is found as we go down our column, jump out of the loop and execute the following:
myRange1 = "A2" 'The start of our selection range in the Excel spreadsheet
endCounter = myCounter - 1 'Subtract 1 from the counter since the counter was last on a blank cell
myRange2 = "A" & endCounter 'The end of our selection range in the Excel spreadsheet
finalRange = myRange1 & ":" & myRange2 'This would look like A2:A114 for example

'Now that we have our final range, select it, copy it, paste (transpose) it, delete the
'rows below row 1 since we only care about preserving this new row 1, and select cell A1
'so that we're always looking at the very first field on any given sheet.
Range(finalRange).Select
Selection.Copy 'Copy it. :-)
Range("A1").Select
Selection.PasteSpecial Paste:=xlPasteAll, Operation:=xlNone, SkipBlanks:=False, Transpose:=True
Rows("2:150").Select
Application.CutCopyMode = False
Selection.Delete Shift:=xlUp
Range("A1").Select
Next i 'Now i = 2, so when the For statement loops back to Step 1, it will select Sheet number i, or Sheet number 2.
'Select the very first sheet to bring us back to home base.
Sheets(1).Select
End Sub