For example: search for "abab".
Requirements:
1. It must be fast.
2. It must ignore case.
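
#2
风吹过b 2020-08-12 20:35
If the file is fairly small, read it into memory in one go as bytes, store it in an array, and then loop looking for the first character, checking it against two values.
The characters to search for can be kept in two arrays: the first holds the all-lowercase values, the second the all-uppercase values.

    For i = 0 To UBound(t) - UBound(a)
        If t(i) = a(0) Or t(i) = b(0) Then      ' first character found
            nf = True
            For j = 1 To UBound(a)
                If t(i + j) <> a(j) And t(i + j) <> b(j) Then   ' differs from the next pattern character
                    nf = False                  ' mark as not found
                    Exit For
                End If
            Next j
            If nf Then
                ' found: handle it
                ' ..........
            End If
        End If
    Next i

Generally speaking, searching in memory is fast. Compared with using InStr: in the IDE, InStr may be a little faster; once compiled, it is hard to say which is faster.
That is roughly all I can think of. Try it out yourself.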
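The two pattern arrays described in this post can be produced from an ordinary string. A minimal sketch, assuming single-byte (ASCII/ANSI) text; `BuildCaseArrays` is a hypothetical helper name, not something from the thread:

```vb
Option Explicit

' Hypothetical helper: fills one all-lowercase and one all-uppercase byte
' array from the same search text. StrConv(..., vbFromUnicode) converts to
' the system ANSI code page, so this is only meaningful for single-byte text.
Public Sub BuildCaseArrays(ByVal SearchText As String, _
                           ByRef LowerBytes() As Byte, _
                           ByRef UpperBytes() As Byte)
    LowerBytes = StrConv(LCase$(SearchText), vbFromUnicode)
    UpperBytes = StrConv(UCase$(SearchText), vbFromUnicode)
End Sub
```

Called as `BuildCaseArrays "aBaB", a, b`, this leaves `a` holding the bytes of "abab" and `b` the bytes of "ABAB", which are the `a()` and `b()` arrays the loop above compares against.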
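
#3
William1949 2020-08-15 09:47
The file is fairly large (over 2 GB).
I use memory-mapped file access to read the file in segments, then match against the contents of each segment.
The key problem: a byte-by-byte search loop is very slow. I have already tried many algorithms and none of them performed well.
InStrB, on the other hand, is fast and works very well, but InStrB is case-sensitive.
In other words: to get speed I cannot ignore case, and to ignore case I cannot get speed. A contradiction!
I ran a comparison against WinHex using a 2.5 GB file.
1. Searching with InStrB (case-sensitive):
   Mine: under 3 seconds.
   WinHex: about 5 seconds (using "Find Hex Values", Ctrl+Alt+X).
2. Searching with a byte-by-byte loop (which also uses InStr):
   Mine: 2 minutes, far too slow (and this is the fastest of my several algorithms).
   WinHex: about 6 seconds (using "Find Text", Ctrl+F).
I have no idea how WinHex manages it.
[This post was edited by the author on 2020-8-15 10:02]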
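
#4
风吹过b 2020-08-15 16:16
I spent an afternoon writing a search along the lines I described. It lists all the matches.
File size versus time, compiled (in the IDE, the 219 MB test took a little over 6 seconds):
219 MB: 1.3 seconds
973 MB: 6.19 seconds
I read the file directly, one block at a time, with the block size set to 4 KB.
With the block size raised to 4 MB, the 973 MB file took 5.7 seconds.
Then I tested with a file larger than 2 GB and ran into a problem I could not solve.
I use a For loop and read the data with Get. Before reading, I need the LOF function to get the file's byte count so that I can count the blocks. For a file over 2 GB, the byte count is too large for LOF's return type (Long), so it raises an error, and I cannot get around that error.
I will think about trying the Input function later and see whether there is a way.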
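One way around the Long limit of LOF is to ask the Win32 API for the size and do the block arithmetic in Currency. A minimal sketch, assuming the file was opened with CreateFile so that a Win32 handle is available (VB's own Open statement does not expose one); `FileSizeBytes` and `SplitIntoBlocks` are hypothetical names:

```vb
Option Explicit

' GetFileSizeEx writes a 64-bit byte count. Declaring the output as Currency
' (a 64-bit integer scaled by 10,000) lets VB6 receive it without overflow.
Private Declare Function GetFileSizeEx Lib "kernel32" _
    (ByVal hFile As Long, ByRef lpFileSize As Currency) As Long

' Hypothetical helper: size in bytes for an already-open Win32 file handle,
' or -1 if the API call fails.
Public Function FileSizeBytes(ByVal hFile As Long) As Currency
    Dim Raw As Currency
    If GetFileSizeEx(hFile, Raw) <> 0 Then
        FileSizeBytes = Raw * 10000@    ' undo the Currency scaling
    Else
        FileSizeBytes = -1
    End If
End Function

' Hypothetical helper: splits a byte count into whole blocks plus the size
' of the final partial block (0 if the size is an exact multiple).
Public Sub SplitIntoBlocks(ByVal TotalBytes As Currency, ByVal BlockSize As Long, _
                           ByRef FullBlocks As Long, ByRef LastBlockBytes As Long)
    FullBlocks = Int(TotalBytes / BlockSize)
    LastBlockBytes = TotalBytes - CCur(FullBlocks) * BlockSize
End Sub
```

For a 2.5 GB file (2,684,354,560 bytes) and 4 MB blocks, `SplitIntoBlocks` gives 640 full blocks and no remainder. The sketch covers only the size calculation; whether Get can then read past the 2 GB mark is a separate question that it does not address.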
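
#5
风吹过b 2020-08-15 16:26
My approach:
1. Open the file, work out the file size, how many blocks it divides into, and how many bytes are in the last block.
2. Read one block and put it in the second half of the buffer. This step exists so that the loop body can handle all the data in the same way.
3. Start the loop:
   Move the second half of the buffer to the first half.
   Then read another block and put what was read into the second half of the buffer.
4. Start the inner loop:
   Look for the first character, using only the first half of the data as the search range. Only one block is compared each time. Two blocks of data are kept so that a search string lying on a block boundary is not cut off within a single block, which would make the search wrongly fail.
   Compare characters with the conditions joined by Or, so that upper and lower case are checked together.
   Once found, use a nested loop to check the following characters, again joining the two cases with Or or And. I suggest Or: Or jumps as soon as one condition holds and does not go on to compare the rest, whereas And has to compare every part.
5. After the loop ends:
   Move the second half of the buffer to the first half.
   Read the remainder, which is less than one block, into the second half of the buffer.
   Run the inner loop again. The search range is block length + remainder length - search string length.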
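The buffering scheme in these steps can be tried out without any file I/O. The sketch below is a simplified variation, not the poster's program: a byte array stands in for the file, "reading a block" is a copy out of that array, and the comparison is case-sensitive to keep the boundary handling in focus. `FindWithTwoHalves` is a hypothetical name.

```vb
Option Explicit

' Returns the 0-based offset of the first match of Pattern() in Source(),
' or -1 if there is none. Assumes the pattern is no longer than one block.
Public Function FindWithTwoHalves(ByRef Source() As Byte, ByRef Pattern() As Byte, _
                                  ByVal BlockSize As Long) As Long
    Dim Buf() As Byte
    Dim Total As Long, PatLen As Long
    Dim BaseOffset As Long          ' offset in Source() of Buf(0)
    Dim NextRead As Long            ' next unread offset in Source()
    Dim n1 As Long, n2 As Long      ' valid bytes in the first / second half
    Dim LastStart As Long, i As Long, j As Long
    Dim Hit As Boolean

    FindWithTwoHalves = -1
    Total = UBound(Source) + 1
    PatLen = UBound(Pattern) + 1
    If PatLen = 0 Or PatLen > BlockSize Or Total < PatLen Then Exit Function

    ReDim Buf(0 To 2 * BlockSize - 1)
    n1 = BlockSize
    If n1 > Total Then n1 = Total
    For i = 0 To n1 - 1             ' first block -> first half
        Buf(i) = Source(i)
    Next i
    NextRead = n1

    Do
        n2 = Total - NextRead
        If n2 > BlockSize Then n2 = BlockSize
        For i = 0 To n2 - 1         ' next block -> second half
            Buf(BlockSize + i) = Source(NextRead + i)
        Next i

        ' A match may start anywhere in the first half and run into the second.
        LastStart = n1 + n2 - PatLen
        If LastStart > n1 - 1 Then LastStart = n1 - 1
        For i = 0 To LastStart
            Hit = True
            For j = 0 To PatLen - 1
                If Buf(i + j) <> Pattern(j) Then
                    Hit = False
                    Exit For
                End If
            Next j
            If Hit Then
                FindWithTwoHalves = BaseOffset + i
                Exit Function
            End If
        Next i

        If n2 = 0 Then Exit Do      ' nothing left to read
        For i = 0 To n2 - 1         ' second half -> first half
            Buf(i) = Buf(BlockSize + i)
        Next i
        n1 = n2
        BaseOffset = BaseOffset + BlockSize
        NextRead = NextRead + n2
    Loop
End Function
```

With the source bytes "0123abab89" and a block size of 5, the pattern "abab" straddles the two blocks and is still reported at offset 4. One detail relevant to step 4: VB6's `Or` and `And` both evaluate all of their operands, so neither skips the second comparison; the early exit in this sketch comes from `Exit For`.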

#6
William1949 2020-08-16 09:34
First of all, a word of thanks to moderator 风: you have worked hard! Thank you for taking the time to help me out.
As for files over 2 GB exceeding the Long range: I read the file contents through the API rather than with Open xxx For Binary As xx.
Here is the program I wrote:

    Option Explicit

    Private Type SYSTEM_INFO
        dwOemID As Long
        dwPageSize As Long
        lpMinimumApplicationAddress As Long
        lpMaximumApplicationAddress As Long
        dwActiveProcessorMask As Long
        dwNumberOrfProcessors As Long
        dwProcessorType As Long
        dwAllocationGranularity As Long
        wProcessorLevel As Integer
        wProcessorRevision As Integer
    End Type

    Private Declare Sub GetSystemInfo Lib "kernel32" (lpSystemInfo As SYSTEM_INFO)
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
    Private Declare Sub ZeroMemory Lib "kernel32" Alias "RtlZeroMemory" (lpDst As Any, ByVal Length As Long)
    Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
    Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" (ByVal lpFileName As String, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long

    Private Const GENERIC_READ = &H80000000
    Private Const FILE_SHARE_READ = &H1
    Private Const FILE_SHARE_WRITE = &H2
    Private Const FILE_SHARE_DELETE = &H4
    Private Const OPEN_EXISTING = 3
    Private Const FILE_ATTRIBUTE_NORMAL = &H80
    Private Const FILE_FLAG_SEQUENTIAL_SCAN = &H8000000

    Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, lpFileSizeHigh As Long) As Long

    '--- File mapping
    Private Declare Function CreateFileMapping Lib "kernel32" Alias "CreateFileMappingA" (ByVal hFile As Long, ByVal lpFileMappigAttributes As Long, ByVal flProtect As Long, ByVal dwMaximumSizeHigh As Long, ByVal dwMaximumSizeLow As Long, ByVal lpName As String) As Long
    Private Declare Function MapViewOfFile Lib "kernel32" (ByVal hFileMappingObject As Long, ByVal dwDesiredAccess As Long, ByVal dwFileOffsetHigh As Long, ByVal dwFileOffsetLow As Long, ByVal dwNumberOfBytesToMap As Long) As Long
    Private Declare Function UnmapViewOfFile Lib "kernel32" (ByVal lpBaseAddress As Long) As Long

    Private Const PAGE_READONLY = &H2
    Private Const FILE_MAP_READ = &H4

    Private Type LARGE_INTEGER
        LowPart As Long
        HighPart As Long
    End Type

    Private fhWnd As Long, AllocationGranularity As Long
    Private mFileSize As Currency
    Private Buffer() As Byte

    Private Sub Form_Load()
        Dim SysInfo As SYSTEM_INFO
        Call GetSystemInfo(SysInfo)
        AllocationGranularity = SysInfo.dwAllocationGranularity
    End Sub

    Private Sub Form_Unload(Cancel As Integer)
        Erase Buffer
    End Sub

    Private Sub Command1_Click()
        Dim FileName As String, FindStr As String
        Dim Le As Long, Pos As Long, P As Long
        Dim FindPos As Currency, Start As Currency
        Dim bFind() As Byte
        Dim TTT As Single

        TTT = Timer
        FileName = "C:\xxx\xxx\xxx.yyy"    ' file name (including path)
        If Dir(FileName) = "" Then Exit Sub
        '--------------------------------------------------------------
        FindStr = "FFAABBCC"               ' bytes to find, as a hex string
        Le = Len(FindStr)
        If Le Mod 2 = 1 Then Le = Le + 1
        Le = Le \ 2 - 1
        ReDim bFind(Le) As Byte
        Pos = 1
        For P = 0 To Le
            bFind(P) = Val("&H" & Mid(FindStr, Pos, 2))
            Pos = Pos + 2
        Next
        '--------------------------------------------------------------
        fhWnd = OpenFile(FileName)             ' open the file
        mFileSize = GetFileSizeAPI(fhWnd)      ' get the file size
        Debug.Print "File size = " & FormatNumber(mFileSize, 0, , , vbTrue) & " bytes"
        ReDim Buffer(AllocationGranularity - 1) As Byte
        ' Note: the buffer size must be AllocationGranularity or a whole multiple of it.
        ' FindByte returns the position of the bytes found; -1 means no match.
        ' Start parameter: the position to start searching from; 0 means from the beginning.
        Start = 0
        FindPos = FindByte(bFind, Start)       ' search
        Call CloseHandle(fhWnd)                ' close the file
        Erase bFind
        Debug.Print "Elapsed = " & (Timer - TTT) * 1000 & " ms; " & "position found = " & FindPos
    End Sub

    Private Function FindByte(ByteFind() As Byte, ByVal Start As Currency) As Currency
        Dim fMaphWnd As Long, MapByteSum As Long, FindLen As Long, bStrPtr As Long, Start2 As Long
        Dim fSize As Currency, Offset As Currency
        Dim Follow As Boolean
        Dim bStrand() As Byte

        FindLen = UBound(ByteFind)
        ReDim bStrand(FindLen * 2 - 1) As Byte
        bStrPtr = VarPtr(bStrand(0))
        MapByteSum = AllocationGranularity
        Offset = Int(Start / AllocationGranularity) * AllocationGranularity
        Start = Start - Offset + 1
        If MapByteSum - Start < FindLen Then Start2 = FindLen - (MapByteSum - Start) Else Start2 = 1
        fSize = mFileSize - Offset
        fMaphWnd = OpenFileMapping(fhWnd)
        Do
            If MapByteSum > fSize Then
                MapByteSum = fSize
                Call ZeroMemory(Buffer(0), AllocationGranularity)
            End If
            Call ReadFileMapping(fMaphWnd, Offset, MapByteSum, Buffer)
            If Follow = True Then
                Call CopyMemory(bStrand(FindLen), Buffer(0), FindLen)
                FindByte = InStrB(Start2, bStrand, ByteFind) - 1
                If FindByte > -1 Then
                    FindByte = Offset - FindLen + FindByte
                    Exit Do
                End If
                Start2 = 1
            End If
            FindByte = InStrB(Start, Buffer, ByteFind) - 1
            If FindByte > -1 Then
                FindByte = Offset + FindByte
                Exit Do
            End If
            If fSize > MapByteSum Then
                Call CopyMemory(ByVal bStrPtr, Buffer(MapByteSum - FindLen), FindLen)
                Follow = True
            End If
            Offset = Offset + AllocationGranularity
            fSize = fSize - MapByteSum
            Start = 1
        Loop Until fSize = 0
        Call CloseHandle(fMaphWnd)             ' close the file mapping
        Erase bStrand
    End Function

    Private Function OpenFile(ByVal FileName As String) As Long
        ' Open the file
        Dim ShareMode As Long
        ShareMode = FILE_SHARE_READ Or FILE_SHARE_WRITE Or FILE_SHARE_DELETE
        OpenFile = CreateFile(FileName, GENERIC_READ, ShareMode, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL Or FILE_FLAG_SEQUENTIAL_SCAN, 0)
    End Function

    Private Function GetFileSizeAPI(ByVal FilehWnd As Long) As Currency
        ' File size in bytes
        Dim fLo As Long, fHi As Long
        fLo = GetFileSize(FilehWnd, fHi)
        GetFileSizeAPI = HighLowToSize(fLo, fHi)
    End Function

    Private Function OpenFileMapping(ByVal FilehWnd As Long, Optional ByVal FileSize As Currency = 0) As Long
        ' Open the file mapping
        Dim fLo As Long, fHi As Long
        If FileSize > 0 Then Call SizeToHighLow(FileSize, fLo, fHi)
        OpenFileMapping = CreateFileMapping(FilehWnd, 0, PAGE_READONLY, fHi, fLo, vbNullString)
    End Function

    Private Function ReadFileMapping(ByVal MapFilehWnd As Long, ByVal Offset As Currency, ByVal ViewSize As Long, ByRef Buffer() As Byte) As Boolean
        Dim MapMemPtr As Long, fLo As Long, fHi As Long
        If Offset > 0 Then Call SizeToHighLow(Offset, fLo, fHi)
        MapMemPtr = MapViewOfFile(MapFilehWnd, FILE_MAP_READ, fHi, fLo, ViewSize)
        If MapMemPtr > 0 Then
            Call CopyMemory(Buffer(0), ByVal MapMemPtr, ViewSize)
            Call UnmapViewOfFile(MapMemPtr)
            ReadFileMapping = True
        End If
    End Function

    Private Function HighLowToSize(ByVal LowLong As Long, ByVal HighLong As Long) As Currency
        Dim LI As LARGE_INTEGER
        With LI
            .LowPart = LowLong
            .HighPart = HighLong
        End With
        Call CopyMemory(HighLowToSize, LI, Len(LI))
        HighLowToSize = HighLowToSize * 10000
    End Function

    Private Sub SizeToHighLow(ByVal FileSize As Currency, ByRef LowLong As Long, ByRef HighLong As Long)
        Dim LI As LARGE_INTEGER
        Call CopyMemory(LI, CCur(FileSize / 10000), Len(LI))
        With LI
            LowLong = .LowPart
            HighLong = .HighPart
        End With
    End Sub

A few notes on the code above:
1. Assign the full path and file name to the FileName variable (a file larger than 2 GB is fine).
2. Assign the search key to the FindStr variable. Note: the format is bytes, for example "FFAABBCC".
3. The code above only searches by bytes, which means it is case-sensitive. I have not managed to write the text mode (ignoring case), or rather, what I wrote is extremely slow.

#7
风吹过b 2020-08-16 12:18
I have a few questions. Your code presumably handles searching a 2.5 GB file; will it fail when a user asks to search a 4.1 GB file? An unsigned Long tops out at 4 GB, and 4.1 GB is just over that.
Also: InStrB returns a Long. When the position exceeds the Long range, what does InStrB return? What happens then?
Your program makes heavy use of Currency values, which can exceed the Long range. But note that most built-in VB functions can only return values in the Long range, not the Currency range. Is there any case here where a value exceeds the Long range but stays within the Currency range?
------------------------
For text search, convert the text to a Byte array and then search in the same way.
The conversion takes only one line:

    s = StrConv("assb", vbFromUnicode)

StrConv converts a string's internal encoding. Its return value can be assigned to a Byte array, and the Byte array is resized automatically.
vbFromUnicode: converts to an ANSI string using the system default code page.
vbUnicode: converts to a Unicode string. It can turn a Byte array containing Chinese text into a string that displays as Chinese.
=======================

    Option Explicit

    Private Type SYSTEM_INFO
        dwOemID As Long
        dwPageSize As Long
        lpMinimumApplicationAddress As Long
        lpMaximumApplicationAddress As Long
        dwActiveProcessorMask As Long
        dwNumberOrfProcessors As Long
        dwProcessorType As Long
        dwAllocationGranularity As Long
        wProcessorLevel As Integer
        wProcessorRevision As Integer
    End Type

    Private Declare Sub GetSystemInfo Lib "kernel32" (lpSystemInfo As SYSTEM_INFO)
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
    Private Declare Sub ZeroMemory Lib "kernel32" Alias "RtlZeroMemory" (lpDst As Any, ByVal Length As Long)
    Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
    Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" (ByVal lpFileName As String, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long

    Private Const GENERIC_READ = &H80000000
    Private Const FILE_SHARE_READ = &H1
    Private Const FILE_SHARE_WRITE = &H2
    Private Const FILE_SHARE_DELETE = &H4
    Private Const OPEN_EXISTING = 3
    Private Const FILE_ATTRIBUTE_NORMAL = &H80
    Private Const FILE_FLAG_SEQUENTIAL_SCAN = &H8000000

    Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, lpFileSizeHigh As Long) As Long

    '--- File mapping
    Private Declare Function CreateFileMapping Lib "kernel32" Alias "CreateFileMappingA" (ByVal hFile As Long, ByVal lpFileMappigAttributes As Long, ByVal flProtect As Long, ByVal dwMaximumSizeHigh As Long, ByVal dwMaximumSizeLow As Long, ByVal lpName As String) As Long
    Private Declare Function MapViewOfFile Lib "kernel32" (ByVal hFileMappingObject As Long, ByVal dwDesiredAccess As Long, ByVal dwFileOffsetHigh As Long, ByVal dwFileOffsetLow As Long, ByVal dwNumberOfBytesToMap As Long) As Long
    Private Declare Function UnmapViewOfFile Lib "kernel32" (ByVal lpBaseAddress As Long) As Long

    Private Const PAGE_READONLY = &H2
    Private Const FILE_MAP_READ = &H4

    Private Type LARGE_INTEGER
        LowPart As Long
        HighPart As Long
    End Type

    Private fhWnd As Long, AllocationGranularity As Long
    Private mFileSize As Currency
    Private Buffer() As Byte

    Private Sub Form_Load()
        Dim SysInfo As SYSTEM_INFO
        Call GetSystemInfo(SysInfo)
        AllocationGranularity = SysInfo.dwAllocationGranularity
    End Sub

    Private Sub Form_Unload(Cancel As Integer)
        Erase Buffer
    End Sub

    Private Sub Command1_Click()
        Dim FileName As String, FindStr As String
        Dim Le As Long, Pos As Long, P As Long
        Dim FindPos As Currency, Start As Currency
        Dim bFind() As Byte, bFind2() As Byte      ' added a second array
        Dim TTT As Single

        TTT = Timer
        FileName = "C:\xxx\xxx\xxx.yyy"    ' file name (including path)
        If Dir(FileName) = "" Then Exit Sub
        '--------------------------------------------------------------
        FindStr = "FFAABBCC"               ' bytes to find, as a hex string
        Le = Len(FindStr)
        If Le Mod 2 = 1 Then Le = Le + 1
        Le = Le \ 2 - 1
        ReDim bFind(Le) As Byte
        ReDim bFind2(Le) As Byte           ' added a second array
        Pos = 1
        For P = 0 To Le
            bFind(P) = Val("&H" & Mid(FindStr, Pos, 2))
            '---------- build the second array with letter case swapped ----------
            If bFind(P) > 64 And bFind(P) < 91 Then         ' upper-case letter
                bFind2(P) = bFind(P) + 32                   ' to lower case
            ElseIf bFind(P) > 96 And bFind(P) < 123 Then    ' lower-case letter
                bFind2(P) = bFind(P) - 32                   ' to upper case
            Else
                bFind2(P) = bFind(P)                        ' non-letters stay as they are
            End If
            Pos = Pos + 2
        Next
        '--------------------------------------------------------------
        fhWnd = OpenFile(FileName)             ' open the file
        mFileSize = GetFileSizeAPI(fhWnd)      ' get the file size
        Debug.Print "File size = " & FormatNumber(mFileSize, 0, , , vbTrue) & " bytes"
        ReDim Buffer(AllocationGranularity - 1) As Byte
        ' Note: the buffer size must be AllocationGranularity or a whole multiple of it.
        ' FindByte returns the position of the bytes found; -1 means no match.
        ' Start parameter: the position to start searching from; 0 means from the beginning.
        Start = 0
        FindPos = FindByte(bFind, bFind2, Start)   ' search; one more array is passed in
        Call CloseHandle(fhWnd)                ' close the file
        Erase bFind
        Debug.Print "Elapsed = " & (Timer - TTT) * 1000 & " ms; " & "position found = " & FindPos
    End Sub

    Private Function FindByte(ByteFind() As Byte, ByteFind2() As Byte, ByVal Start As Currency) As Currency
        ' one more array has to be passed in
        Dim fMaphWnd As Long, MapByteSum As Long, FindLen As Long, bStrPtr As Long, Start2 As Long
        Dim fSize As Currency, Offset As Currency
        Dim Follow As Boolean
        Dim bStrand() As Byte

        FindLen = UBound(ByteFind)
        ReDim bStrand(FindLen * 2 - 1) As Byte
        bStrPtr = VarPtr(bStrand(0))
        MapByteSum = AllocationGranularity
        Offset = Int(Start / AllocationGranularity) * AllocationGranularity
        Start = Start - Offset + 1
        If MapByteSum - Start < FindLen Then Start2 = FindLen - (MapByteSum - Start) Else Start2 = 1
        fSize = mFileSize - Offset
        fMaphWnd = OpenFileMapping(fhWnd)
        Do
            If MapByteSum > fSize Then
                MapByteSum = fSize
                Call ZeroMemory(Buffer(0), AllocationGranularity)
            End If
            Call ReadFileMapping(fMaphWnd, Offset, MapByteSum, Buffer)
            If Follow = True Then
                Call CopyMemory(bStrand(FindLen), Buffer(0), FindLen)
                ' FindByte = InStrB(Start2, bStrand, ByteFind) - 1      ' InStrB replaced by a custom function
                FindByte = UInStrB(Start2, bStrand, ByteFind, ByteFind2) - 1
                If FindByte > -1 Then
                    FindByte = Offset - FindLen + FindByte
                    Exit Do
                End If
                Start2 = 1
            End If
            ' FindByte = InStrB(Start, Buffer, ByteFind) - 1
            FindByte = UInStrB(Start2, bStrand, ByteFind, ByteFind2) - 1
            If FindByte > -1 Then
                FindByte = Offset + FindByte
                Exit Do
            End If
            If fSize > MapByteSum Then
                Call CopyMemory(ByVal bStrPtr, Buffer(MapByteSum - FindLen), FindLen)
                Follow = True
            End If
            Offset = Offset + AllocationGranularity
            fSize = fSize - MapByteSum
            Start = 1
        Loop Until fSize = 0
        Call CloseHandle(fMaphWnd)             ' close the file mapping
        Erase bStrand
    End Function

    Private Function UInStrB(ByVal Start2 As Long, ByRef bStrand() As Byte, ByRef ByteFind() As Byte, ByRef ByteFind2() As Byte) As Currency
        Dim FN As Boolean
        Dim i As Long                      ' loop variable
        Dim bfw1 As Long, bfw2 As Long     ' two position variables
        Do
            bfw1 = InStrB(Start2, bStrand, ByteFind(0))
            bfw2 = InStrB(Start2, bStrand, ByteFind2(0))
            '----------- take the nearest position ------------
            ' Possible cases: 0,0; >0,0; 0,>0; >0,>0.
            If bfw1 = 0 And bfw2 = 0 Then          ' case 1: not found, leave the loop
                Exit Do
            'ElseIf bfw1 > 0 And bfw2 = 0 Then     ' case 2 needs no handling, so this test can be skipped
                ' case 2 needs no handling
            ElseIf bfw1 = 0 And bfw2 > 0 Then      ' case 3: use the second position
                bfw1 = bfw2
            ElseIf bfw1 > 0 And bfw2 > 0 Then      ' case 4: use the nearest position
                If bfw1 > bfw2 Then bfw1 = bfw2
            End If
            FN = True
            For i = 1 To UBound(ByteFind)
                If bStrand(bfw1 + i) = ByteFind(i) Or bStrand(bfw1 + i) = ByteFind2(i) Then   ' equal to one of the two
                Else                                ' equal to neither, so mark as not found
                    FN = False
                End If
            Next i
            Start2 = bfw1 + 1                       ' continue from the new position
        Loop While Not FN   ' after the For loop, FN is True if a match was found and no further search is needed; otherwise the Do loop continues
        UInStrB = bfw1      ' bfw1 is either the result obtained with FN = True, or the 0 from never entering the For loop
    End Function

    Private Function OpenFile(ByVal FileName As String) As Long
        ' Open the file
        Dim ShareMode As Long
        ShareMode = FILE_SHARE_READ Or FILE_SHARE_WRITE Or FILE_SHARE_DELETE
        OpenFile = CreateFile(FileName, GENERIC_READ, ShareMode, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL Or FILE_FLAG_SEQUENTIAL_SCAN, 0)
    End Function

    Private Function GetFileSizeAPI(ByVal FilehWnd As Long) As Currency
        ' File size in bytes
        Dim fLo As Long, fHi As Long
        fLo = GetFileSize(FilehWnd, fHi)
        GetFileSizeAPI = HighLowToSize(fLo, fHi)
    End Function

    Private Function OpenFileMapping(ByVal FilehWnd As Long, Optional ByVal FileSize As Currency = 0) As Long
        ' Open the file mapping
        Dim fLo As Long, fHi As Long
        If FileSize > 0 Then Call SizeToHighLow(FileSize, fLo, fHi)
        OpenFileMapping = CreateFileMapping(FilehWnd, 0, PAGE_READONLY, fHi, fLo, vbNullString)
    End Function

    Private Function ReadFileMapping(ByVal MapFilehWnd As Long, ByVal Offset As Currency, ByVal ViewSize As Long, ByRef Buffer() As Byte) As Boolean
        Dim MapMemPtr As Long, fLo As Long, fHi As Long
        If Offset > 0 Then Call SizeToHighLow(Offset, fLo, fHi)
        MapMemPtr = MapViewOfFile(MapFilehWnd, FILE_MAP_READ, fHi, fLo, ViewSize)
        If MapMemPtr > 0 Then
            Call CopyMemory(Buffer(0), ByVal MapMemPtr, ViewSize)
            Call UnmapViewOfFile(MapMemPtr)
            ReadFileMapping = True
        End If
    End Function

    Private Function HighLowToSize(ByVal LowLong As Long, ByVal HighLong As Long) As Currency
        Dim LI As LARGE_INTEGER
        With LI
            .LowPart = LowLong
            .HighPart = HighLong
        End With
        Call CopyMemory(HighLowToSize, LI, Len(LI))
        HighLowToSize = HighLowToSize * 10000
    End Function

    Private Sub SizeToHighLow(ByVal FileSize As Currency, ByRef LowLong As Long, ByRef HighLong As Long)
        Dim LI As LARGE_INTEGER
        Call CopyMemory(LI, CCur(FileSize / 10000), Len(LI))
        With LI
            LowLong = .LowPart
            HighLong = .HighPart
        End With
    End Sub
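
#8
William1949 2020-08-16 16:00
Reply:
1. Files larger than 4.1 GB will not overflow; even files larger than 10 GB will not. The lpFileSizeHigh parameter of GetFileSize receives the high Long of the file size, which is a count rather than a file size in the literal sense (at least, that is how I understand it). For example, when the file is smaller than 2^32 (4294967296) bytes, lpFileSizeHigh is 0; when the file is larger than 4294967296, lpFileSizeHigh is 1; when the file is larger than 4294967296 * 2, lpFileSizeHigh is 2; and so on.
2. As for InStrB's return value exceeding the Long range: I have not tried it. My code in post #6 cannot let that happen, because each search only covers the bytes in the buffer, and that buffer (Buffer) never comes near the Long limit. Note: the buffer I define is SysInfo.dwAllocationGranularity bytes (65536).
3. I do not use Currency heavily. I use it:
   A. when getting the file size;
   B. when setting the search start position, and the FindByte procedure then adjusts it with "Start = Start - Offset + 1", so the value of Start never gets large;
   C. as the offset (Offset), so that it can be passed to MapViewOfFile; this does not affect VB's built-in functions.
4. I ran your modified program and I think there is a problem. You only swapped the case for bFind2, but the buffer contents are unchanged, so some matches are missed.
   For example: bFind is "aBaB" and bFind2 is "AbAb"; if the buffer (the Buffer byte array) contains "ABab", it will be missed.
   I have written this approach before, but by converting every byte of the buffer in the 97-122 range to upper case, which is very time-consuming.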
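The high and low Longs in point 1 are the two 32-bit halves of a single 64-bit byte count, so the size is the high half times 2^32 plus the low half read as an unsigned value. A minimal sketch of that arithmetic, written with plain Currency maths rather than the CopyMemory approach of post #6; `HighLowToBytes` is a hypothetical name:

```vb
Option Explicit

' Combines the two halves returned by GetFileSize into one byte count.
' LowLong is stored as a signed Long, so values of 2^31 and above arrive
' as negative numbers and need 2^32 added back.
' Currency holds about 9.2E+14, so this covers sizes up to roughly 800 TB.
Public Function HighLowToBytes(ByVal LowLong As Long, ByVal HighLong As Long) As Currency
    Const TWO_POW_32 As Currency = 4294967296@
    Dim LowUnsigned As Currency

    LowUnsigned = LowLong
    If LowLong < 0 Then LowUnsigned = LowUnsigned + TWO_POW_32
    HighLowToBytes = CCur(HighLong) * TWO_POW_32 + LowUnsigned
End Function
```

For a file of 4,402,341,478 bytes (about 4.1 GB), the halves are high = 1 and low = 107,374,182, and `HighLowToBytes(107374182, 1)` reproduces the original size.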

#9
jklqwe111 2020-08-16 16:29
I can't tell what the original poster wants to find. Is it the string "FFAABBCC"?
The following code is confusing, especially the line bFind(P) = Val("&H" & Mid(FindStr, Pos, 2)). What is it for?

    FindStr = "FFAABBCC"    ' bytes to find, as a hex string
    Le = Len(FindStr)
    If Le Mod 2 = 1 Then Le = Le + 1
    Le = Le \ 2 - 1
    ReDim bFind(Le) As Byte
    Pos = 1
    For P = 0 To Le
        bFind(P) = Val("&H" & Mid(FindStr, Pos, 2))
        Pos = Pos + 2
    Next
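
#10
William1949 2020-08-16 16:50
Ah! All right, let me explain it this way:
My FindStr = "FFAABBCC" was only an example to draw attention to the format. The point is that you can only enter bytes, that is, only values between 00 and FF, with every two digits (two characters) forming one byte. "FFAABBCC" means 4 bytes. You can enter whatever you like, as long as the format is right.
Another example (see the picture in post #1):
you can enter FindStr = "61426142"; it is up to you. I think I have made myself clear!
It is just like typing into the search box in WinHex.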
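The relationship between a text key and the hex format described here can be made concrete. A minimal sketch, assuming single-byte ANSI text; `HexToBytes` and `TextToHex` are hypothetical helpers, and `HexToBytes` uses the same `Val("&H" & ...)` conversion as the code in post #6:

```vb
Option Explicit

' "61426142" -> bytes &H61, &H42, &H61, &H42. Raises error 5 (invalid
' procedure call) unless the text has an even, non-zero number of digits.
Public Function HexToBytes(ByVal HexText As String) As Byte()
    Dim Result() As Byte
    Dim i As Long

    If Len(HexText) = 0 Or Len(HexText) Mod 2 <> 0 Then Err.Raise 5
    ReDim Result(0 To Len(HexText) \ 2 - 1)
    For i = 0 To UBound(Result)
        Result(i) = Val("&H" & Mid$(HexText, 2 * i + 1, 2))
    Next i
    HexToBytes = Result
End Function

' "aBaB" -> "61426142", using the system ANSI code page for the conversion.
Public Function TextToHex(ByVal SourceText As String) As String
    Dim Bytes() As Byte
    Dim i As Long

    Bytes = StrConv(SourceText, vbFromUnicode)
    For i = 0 To UBound(Bytes)
        TextToHex = TextToHex & Right$("0" & Hex$(Bytes(i)), 2)
    Next i
End Function
```

`TextToHex("aBaB")` returns "61426142", which is the second example key in this post, so that key searches for the ANSI text "aBaB".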
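
#11
风吹过b 2020-08-16 17:42
The buffer contains ABaB, and the search is for aBaB.
ByteFind = aBaB and ByteFind2 = AbAb. In the first search I pass in only the first byte:

    bfw1 = InStrB(Start2, bStrand, ByteFind(0))     ' this looks for a
    bfw2 = InStrB(Start2, bStrand, ByteFind2(0))    ' this looks for A

Then, in the loop, the following characters are compared one by one. Each one is compared against both variants, so nothing is missed:

    For i = 1 To UBound(ByteFind)
        If bStrand(bfw1 + i) = ByteFind(i) Or bStrand(bfw1 + i) = ByteFind2(i) Then   ' equal to one of the two

The loop compares the three pairs B\b, a\A and B\b, so nothing is missed.
The first hit comes from bfw2; bfw1 is then either 0 or a larger value, so after the test bfw2 is used.
In the loop, B = ByteFind(1), a = ByteFind(2) and B = ByteFind(3) all hit the first half of the test
bStrand(bfw1 + i) = ByteFind(i) Or bStrand(bfw1 + i) = ByteFind2(i).
Because the test is an Or, the whole expression is true and the loop continues.
A hit on the second half also gives a true result, and the second half corresponds to the other case of the search string.
So my code will not miss this situation.
-------------------
I am on a different computer today with no test environment, so I cannot test it myself.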
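The comparison described in this post can be checked in isolation with the indices written out. The sketch below is a separate, hypothetical function (`InStrBAnyCase`), not a patch to the code in post #7. It differs from that code in three visible ways, any of which may or may not relate to the misses reported in post #8: it passes `ChrB$(...)` so that InStrB receives a one-byte string rather than a numeric Byte value, it converts the 1-based InStrB position to a 0-based array index, and it stops before the pattern could run past the end of the buffer.

```vb
Option Explicit

' Returns the 1-based position (InStrB convention) of the first match in
' Hay() at or after Start, accepting either case variant for each byte.
' PatA() and PatB() are the two variants of the pattern; returns 0 if no match.
Public Function InStrBAnyCase(ByVal Start As Long, ByRef Hay() As Byte, _
                              ByRef PatA() As Byte, ByRef PatB() As Byte) As Long
    Dim HayStr As String
    Dim PosA As Long, PosB As Long, Pos As Long, LastPos As Long, i As Long
    Dim Matched As Boolean

    HayStr = Hay                                ' raw byte copy, done once
    LastPos = UBound(Hay) + 1 - UBound(PatA)    ' last start that leaves room for the pattern
    Pos = Start
    Do While Pos >= 1 And Pos <= LastPos
        PosA = InStrB(Pos, HayStr, ChrB$(PatA(0)), vbBinaryCompare)
        PosB = InStrB(Pos, HayStr, ChrB$(PatB(0)), vbBinaryCompare)
        If PosA = 0 Then                        ' take the nearer non-zero hit
            Pos = PosB
        ElseIf PosB = 0 Then
            Pos = PosA
        ElseIf PosA < PosB Then
            Pos = PosA
        Else
            Pos = PosB
        End If
        If Pos = 0 Or Pos > LastPos Then Exit Do

        Matched = True
        For i = 1 To UBound(PatA)
            ' Pos is 1-based, Hay() is 0-based: pattern byte i sits at Hay(Pos - 1 + i).
            If Hay(Pos - 1 + i) <> PatA(i) And Hay(Pos - 1 + i) <> PatB(i) Then
                Matched = False
                Exit For
            End If
        Next i
        If Matched Then
            InStrBAnyCase = Pos
            Exit Function
        End If
        Pos = Pos + 1                           ' resume just after the failed candidate
    Loop
End Function
```

With a buffer holding "xxABabyy" and the variants "abab" and "ABAB", the function returns 3, the position of the mixed-case "ABab" from post #8.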
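
#12
William1949 2020-08-17 09:05
It won't miss anything????
Fine. If you say it won't miss anything, then it won't.
You have not actually tested it, so I have nothing more to say!
I did test it and found missed matches, yet you say "it won't miss". I posted to ask for help and do not want to get bogged down arguing about this.
Case closed.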
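
#13
jklqwe111 2020-08-17 11:20
Judging by what the original poster means, whatever form the search condition is given in, the goal in the end is to find a character sequence. If so, comparing bytes directly is problematic, even for a case-sensitive search. This mainly comes down to character sets and character encodings: in some situations it cannot give correct results. Character searches generally compare characters. In some cases they can be reduced to byte comparison, but that depends entirely on the character encoding. Until those conditions are pinned down, all the search operations and optimizations are pointless.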
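A small illustration of the encoding point: the same two characters produce different byte sequences depending on how the string is turned into bytes, so a byte pattern built one way will not match file contents stored the other way. `HexDump` and `ShowEncodings` are hypothetical helper names.

```vb
Option Explicit

' Returns the bytes of an array as a string of two-digit hex values.
Public Function HexDump(ByRef Bytes() As Byte) As String
    Dim i As Long
    For i = LBound(Bytes) To UBound(Bytes)
        HexDump = HexDump & Right$("0" & Hex$(Bytes(i)), 2)
    Next i
End Function

Public Sub ShowEncodings()
    Dim Ansi() As Byte, Utf16() As Byte

    Ansi = StrConv("ab", vbFromUnicode)     ' system ANSI code page
    Utf16 = "ab"                            ' direct assignment copies the UTF-16LE bytes
    Debug.Print HexDump(Ansi)               ' prints 6162
    Debug.Print HexDump(Utf16)              ' prints 61006200
End Sub
```

A search key converted with `StrConv(..., vbFromUnicode)` therefore only finds text that the file stores in the same single-byte code page; UTF-16 or multi-byte encoded text needs a key built in that encoding.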
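
#14
wmf2014 2020-08-17 20:21
Write your own KMP algorithm, and convert each character to upper case as it is scanned so that matching is done entirely in upper case. VB is fairly slow in the IDE, but much faster once compiled, and it should not be much slower than InStrB.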
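A minimal sketch of the suggestion above: a standard KMP search over byte arrays in which the pattern is folded to upper case once and each text byte is folded as it is read. It assumes ASCII letters only, and `FoldByte` and `KmpFindNoCase` are hypothetical names. No timing claim is made for it.

```vb
Option Explicit

' Folds ASCII a-z to A-Z; every other byte value is returned unchanged.
Private Function FoldByte(ByVal B As Byte) As Byte
    If B >= 97 And B <= 122 Then FoldByte = B - 32 Else FoldByte = B
End Function

' Returns the 0-based index of the first ASCII-case-insensitive match of
' Pattern() in Hay(), or -1 if there is none.
Public Function KmpFindNoCase(ByRef Hay() As Byte, ByRef Pattern() As Byte) As Long
    Dim M As Long, N As Long, i As Long, k As Long
    Dim Pat() As Byte, Fail() As Long
    Dim C As Byte

    KmpFindNoCase = -1
    M = UBound(Pattern) + 1
    N = UBound(Hay) + 1
    If M = 0 Or N < M Then Exit Function

    ' Fold the pattern once, up front.
    ReDim Pat(0 To M - 1)
    For i = 0 To M - 1
        Pat(i) = FoldByte(Pattern(i))
    Next i

    ' Failure table: Fail(i) is the length of the longest proper prefix of
    ' Pat(0..i) that is also a suffix of it.
    ReDim Fail(0 To M - 1)
    k = 0
    For i = 1 To M - 1
        Do While k > 0 And Pat(i) <> Pat(k)
            k = Fail(k - 1)
        Loop
        If Pat(i) = Pat(k) Then k = k + 1
        Fail(i) = k
    Next i

    ' Scan the text, folding each byte as it is read.
    k = 0
    For i = 0 To N - 1
        C = FoldByte(Hay(i))
        Do While k > 0 And C <> Pat(k)
            k = Fail(k - 1)
        Loop
        If C = Pat(k) Then k = k + 1
        If k = M Then
            KmpFindNoCase = i - M + 1
            Exit Function
        End If
    Next i
End Function
```

Because the text bytes are folded on the fly, the buffer itself is never rewritten, which avoids the separate upper-casing pass over the buffer that post #8 found too slow.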
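
#15
yogod 2020-09-26 09:37
Following this thread to learn. I am also curious about WinHex's search speed. In my tests, the memory-mapped approach already comes close to WinHex's search speed.
On the case issue, I think WinHex may be using threads: when searching for the hexadecimal byte string, it starts a number of threads matching the number of case combinations, all searching the same mapped segment.
On the missed matches, I think matches at the junction of two blocks get missed. The fix is to overlap 512 bytes in MapViewOfFile. I am working on this.
[This post was edited by the author on 2020-9-26 11:36]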
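One way to get the overlap described here: MapViewOfFile requires the view offset to be a multiple of the allocation granularity, so the offsets stay where they are and each view is simply mapped a little longer. A minimal sketch of the length calculation, assuming views start at multiples of the granularity as in post #6; `ViewLength` is a hypothetical name:

```vb
Option Explicit

' For a view starting at Offset, returns how many bytes to map so that the
' view also covers the first Overlap bytes of the following block, clipped
' to the end of the file. Returns 0 when Offset is at or past the end.
Public Function ViewLength(ByVal FileSize As Currency, ByVal Offset As Currency, _
                           ByVal Granularity As Long, ByVal Overlap As Long) As Long
    Dim Remaining As Currency

    Remaining = FileSize - Offset
    If Remaining <= 0 Then
        ViewLength = 0
    ElseIf Remaining < CCur(Granularity) + Overlap Then
        ViewLength = Remaining
    Else
        ViewLength = Granularity + Overlap
    End If
End Function
```

With a granularity of 65536 and an overlap of 512, a 200,000-byte file is covered by views of 66048, 66048, 66048 and 3392 bytes. An overlap of pattern length minus one is enough to catch a match that straddles two views, so 512 bytes covers keys up to 513 bytes long.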