目前,我正在适应新的工作,其中最让我与同事共享的工作是通过MS Excel中。 我使用的数据透视表频繁,因此需要“堆叠”的数据,在精确的输出melt()
函数中reshape
(reshape2)封装中的R,我已经开始依赖于这一点。
任何人都可以让我开始对VBA宏来做到这一点,还是一个已经存在?
宏观的轮廓是:
- 在Excel工作簿中选择的单元格区域。
- 开始“融化”宏。
- 宏将创建一个提示,“请输入ID列数”,在这里你可以输入前面的识别信息的列数。 (对于为R代码下面是4)。
- 创建名为“融化”,将栈中的数据Excel文件一个新的工作表,并创建名为从原来的选择“变量”等于数据列标题的新列。
换句话说,输出将看起来完全一样简单地执行中的R这两行的输出:
require(reshape)
melt(your.unstacked.dataframe, id.vars = 1:4)
下面是一个例子:
# unstacked data
> df1
Year Month Country Sport No_wins No_losses High_score Total_games
2 2010 5 USA Soccer 4 3 5 9
3 2010 6 USA Soccer 5 3 4 8
4 2010 5 CAN Soccer 2 9 7 11
5 2010 6 CAN Soccer 4 8 4 13
6 2009 5 USA Soccer 8 1 4 9
7 2009 6 USA Soccer 0 0 3 2
8 2009 5 CAN Soccer 2 0 6 3
9 2009 6 CAN Soccer 3 0 8 3
# stacking the data
> require(reshape)
> melt(df1, id.vars=1:4)
Year Month Country Sport variable value
1 2010 5 USA Soccer No_wins 4
2 2010 6 USA Soccer No_wins 5
3 2010 5 CAN Soccer No_wins 2
4 2010 6 CAN Soccer No_wins 4
5 2009 5 USA Soccer No_wins 8
6 2009 6 USA Soccer No_wins 0
7 2009 5 CAN Soccer No_wins 2
8 2009 6 CAN Soccer No_wins 3
9 2010 5 USA Soccer No_losses 3
10 2010 6 USA Soccer No_losses 3
11 2010 5 CAN Soccer No_losses 9
12 2010 6 CAN Soccer No_losses 8
13 2009 5 USA Soccer No_losses 1
14 2009 6 USA Soccer No_losses 0
15 2009 5 CAN Soccer No_losses 0
16 2009 6 CAN Soccer No_losses 0
17 2010 5 USA Soccer High_score 5
18 2010 6 USA Soccer High_score 4
19 2010 5 CAN Soccer High_score 7
20 2010 6 CAN Soccer High_score 4
21 2009 5 USA Soccer High_score 4
22 2009 6 USA Soccer High_score 3
23 2009 5 CAN Soccer High_score 6
24 2009 6 CAN Soccer High_score 8
25 2010 5 USA Soccer Total_games 9
26 2010 6 USA Soccer Total_games 8
27 2010 5 CAN Soccer Total_games 11
28 2010 6 CAN Soccer Total_games 13
29 2009 5 USA Soccer Total_games 9
30 2009 6 USA Soccer Total_games 2
31 2009 5 CAN Soccer Total_games 3
32 2009 6 CAN Soccer Total_games 3
我有两个职位,可用的代码和可下载的工作簿,对我的博客在Excel / VBA这样做:
http://yoursumbuddy.com/data-normalizer
http://yoursumbuddy.com/data-normalizer-the-sql/
下面的代码:
'Arguments
'List: The range to be normalized.
'RepeatingColsCount: The number of columns, starting with the leftmost,
' whose headings remain the same.
'NormalizedColHeader: The column header for the rolled-up category.
'DataColHeader: The column header for the normalized data.
'NewWorkbook: Put the sheet with the data in a new workbook?
'
'NOTE: The data must be in a contiguous range and the
'rows that will be repeated must be to the left,
'with the rows to be normalized to the right.
Sub NormalizeList(List As Excel.Range, RepeatingColsCount As Long, _
NormalizedColHeader As String, DataColHeader As String, _
Optional NewWorkbook As Boolean = False)
Dim FirstNormalizingCol As Long, NormalizingColsCount As Long
Dim ColsToRepeat As Excel.Range, ColsToNormalize As Excel.Range
Dim NormalizedRowsCount As Long
Dim RepeatingList() As String
Dim NormalizedList() As Variant
Dim ListIndex As Long, i As Long, j As Long
Dim wbSource As Excel.Workbook, wbTarget As Excel.Workbook
Dim wsTarget As Excel.Worksheet
With List
'If the normalized list won't fit, you must quit.
If .Rows.Count * (.Columns.Count - RepeatingColsCount) > .Parent.Rows.Count Then
MsgBox "The normalized list will be too many rows.", _
vbExclamation + vbOKOnly, "Sorry"
Exit Sub
End If
'You have the range to be normalized and the count of leftmost rows to be repeated.
'This section uses those arguments to set the two ranges to parse
'and the two corresponding arrays to fill
FirstNormalizingCol = RepeatingColsCount + 1
NormalizingColsCount = .Columns.Count - RepeatingColsCount
Set ColsToRepeat = .Cells(1).Resize(.Rows.Count, RepeatingColsCount)
Set ColsToNormalize = .Cells(1, FirstNormalizingCol).Resize(.Rows.Count, NormalizingColsCount)
NormalizedRowsCount = ColsToNormalize.Columns.Count * .Rows.Count
ReDim RepeatingList(1 To NormalizedRowsCount, 1 To RepeatingColsCount)
ReDim NormalizedList(1 To NormalizedRowsCount, 1 To 2)
End With
'Fill in every i elements of the repeating array with the repeating row labels.
For i = 1 To NormalizedRowsCount Step NormalizingColsCount
ListIndex = ListIndex + 1
For j = 1 To RepeatingColsCount
RepeatingList(i, j) = List.Cells(ListIndex, j).Value2
Next j
Next i
'We stepped over most rows above, so fill in other repeating array elements.
For i = 1 To NormalizedRowsCount
For j = 1 To RepeatingColsCount
If RepeatingList(i, j) = "" Then
RepeatingList(i, j) = RepeatingList(i - 1, j)
End If
Next j
Next i
'Fill in each element of the first dimension of the normalizing array
'with the former column header (which is now another row label) and the data.
With ColsToNormalize
For i = 1 To .Rows.Count
For j = 1 To .Columns.Count
NormalizedList(((i - 1) * NormalizingColsCount) + j, 1) = .Cells(1, j)
NormalizedList(((i - 1) * NormalizingColsCount) + j, 2) = .Cells(i, j)
Next j
Next i
End With
'Put the normal data in the same workbook, or a new one.
If NewWorkbook Then
Set wbTarget = Workbooks.Add
Set wsTarget = wbTarget.Worksheets(1)
Else
Set wbSource = List.Parent.Parent
With wbSource.Worksheets
Set wsTarget = .Add(after:=.Item(.Count))
End With
End If
With wsTarget
'Put the data from the two arrays in the new worksheet.
.Range("A1").Resize(NormalizedRowsCount, RepeatingColsCount) = RepeatingList
.Cells(1, FirstNormalizingCol).Resize(NormalizedRowsCount, 2) = NormalizedList
'At this point there will be repeated header rows, so delete all but one.
.Range("1:" & NormalizingColsCount - 1).EntireRow.Delete
'Add the headers for the new label column and the data column.
.Cells(1, FirstNormalizingCol).Value = NormalizedColHeader
.Cells(1, FirstNormalizingCol + 1).Value = DataColHeader
End With
End Sub
你会这样称呼它:
Sub TestIt()
NormalizeList ActiveSheet.UsedRange, 4, "Variable", "Value", False
End Sub
微软最近推出了功率查询,一个Excel加载在其中增加了很多有趣的功能和性能,以数据操作在Excel中,包括你在找什么。
内外接的实际功能被称为“逆透视列”,这是解释在这篇文章中 。 下面是它的要点:
- 下载并安装加载项
- 打开你的Excel / CSV文件
- 选择表/范围要融化/重塑
- 在“电源查询”选项卡,点击“从表”,这将打开“查询编辑器”
- 选择您要融化/重塑栏(Ctrl或Shift选择,不拖)
- 在“逆透视列”的“转换”选项卡中单击(你也可以返回到Excel之前申请在这里等变换)
- 在“主页”选项卡中单击“关闭并加载”。 这将在Excel中创建一个新表/查询对象与期望的结果。
为寻找一个可视化的方式来规范Excel数据,看到这个视频教程:
http://www.youtube.com/watch?v=xmqTN0X-AgY
首先创建一个用户窗体和两个RefEdit控件的字段命名为Unpivot_Form - rng_id和value_id和提交/ Go按钮。 我还的R用户,并rng_id是包含ID而value_id包含值的范围内; 两个范围包括标头的。
做两个宏:
Sub unpivot()
Unpivot_Form.Show
End Sub
另一个宏是该领域的提交/去按钮内:
Private Sub submit_Click()
'Code to unpivot (convert wide to long for excel)
Dim rng_id, rng_id_header, val_id As Range
Dim colvar, emptyrow, col As Integer
Dim new_sheet As Worksheet
'Put val_id range into a range object
Set val_id = Range(value_id.Value)
'Determine the parameter for the value id range
'This is used for the looping later on
numrows = val_id.Rows.Count
numcols = val_id.Columns.Count
'Resize changes the "block" to the size defined by the row and column
'Offset moves the "block"
Set rng_id_header = Range(range_id.Value).Resize(1)
Set rng_id = Range(range_id.Value).Offset(1, 0).Resize(numrows - 1)
Set new_sheet = Worksheets.Add
'Set up the first column and first batch of id vars
new_sheet.Activate
Range("A65535").End(xlUp).Activate
rng_id_header.Copy ActiveCell
colvar = Range("XFD1").End(xlToLeft).Column + 1
Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Variable"
Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Value"
'Start populating the value ids
For col = 1 To numcols
'populate var_id
'determine last row
emptyrow = Range("A65535").End(xlUp).Row + 1
'no need to activate to source to copy
rng_id.Copy new_sheet.Cells(emptyrow, 1)
'copy the variable
val_id.Offset(, col - 1).Resize(1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar), Cells(emptyrow + numrows - 2, colvar))
'copy the value
val_id.Offset(1, col - 1).Resize(numrows - 1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar + 1), Cells(emptyrow + numrows - 2, colvar + 1))
Next
Unload Me
End Sub
请享用!
或使用:
Sub M_snb_000()
With sheet1.Cells(1).CurrentRegion
sn = .Resize(, .Columns.Count + 1)
End With
For j = 4 To UBound(sn, 2) - 1
With Sheet2.Cells(2 + (UBound(sn) - 1) * (j - 4), 1)
.Resize(UBound(sn) - 1, 5) = Application.Index(sn, Evaluate("row(2:"
& UBound(sn) & ")"), Array(1, 2, 3,UBound(sn, 2), j))
.Resize(UBound(sn) - 1, 1).Offset(, 3) = sn(1, j)
End With
Next
End Sub