How to download the thread titles posted on the "VFP Forum" - VFP Forum

王咸美

Level: Newcomer
Posts: 893
Expert points: 3
Registered: 2018-1-4
Thread resolution rate: 97.52%

Original poster

Resolved √ Points offered: 20 Replies: 38

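How to download the thread titles posted on the "VFP Forum"

How can I download the thread titles posted on the "VFP Forum" and write them into the table file vfplt.dbf? The fields of vfplt.dbf are: 序号 N(8) (sequence number), 主题 C(60) (title), 发表 C(10) (author), 回复 N(6) (replies), 人气 N(8) (views), 最后更新 C(20) (last update).
Page URL: https://bbs.bc-cn.net/forum-22-1.html
Any advice from the experts here would be greatly appreciated! (This is purely a personal hobby. If it is not to your taste, please just pass by.)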
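
The field list above maps directly onto a CREATE TABLE command. The following is only a minimal sketch: it assumes the current code page accepts Chinese field names, and the inserted row is made-up sample data, not real forum content.

```foxpro
* Sketch: create vfplt.dbf with the layout given in the question.
* Assumes the current code page allows Chinese field names.
CREATE TABLE vfplt (序号 N(8), 主题 C(60), 发表 C(10), 回复 N(6), 人气 N(8), 最后更新 C(20))

* Scraped values arrive as text, so the numeric fields need VAL().
* The values below are placeholders for illustration only.
INSERT INTO vfplt (序号, 主题, 发表, 回复, 人气, 最后更新) ;
    VALUES (1, "sample title", "someone", VAL("38"), VAL("1200"), "2025-11-13 16:43")

BROWSE
```

Keeping 回复 and 人气 numeric makes it possible to sort or filter by reply count and views later, at the cost of one VAL() call per value when the rows are written.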

2025-11-13 16:43

王咸美

Reply #3

Score: 0

Fill in a running count of the threads, starting from 1 and incrementing upward.

2025-11-13 17:40

sam_jiang

Level: VIP
Prestige: 14
Posts: 1065
Expert points: 1637
Registered: 2021-10-13

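Reply #5

Score: 0

Replying to 吹水佬 in #2

This is the same as last time, when we scraped news headlines from Sina. The code will work with a few changes.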

2025-11-14 12:11

王咸美

Reply #7

Score: 0

Thank you! Could you share the actual code?

2025-11-14 16:09

吹水佬

Level: Moderator
Prestige: 451
Posts: 10919
Expert points: 43569
Registered: 2014-5-20

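Reply #9

Score: 0

The example fetches only two pages, as a reference.
Because a lot of string processing is involved, extracting the data with a parser still felt slow, and doing it with VFP string functions is slower still.
So I tried pointers instead. This may be risky, so do not copy it verbatim.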
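
For comparison, the plain string-function approach mentioned above can be sketched with STREXTRACT(). This is only an illustration on a made-up one-row HTML fragment, not the forum's actual markup. On a full page each call rescans the string from the start, which is one plausible reason it is slower than walking a pointer forward.

```foxpro
* Sketch: extract one title with VFP string functions only.
* cRow is a hypothetical sample row, not real page content.
LOCAL cRow, cCell, cTitle
cRow = [<td class="title"><a href="thread-1.html">Sample title</a></td>]

* First isolate the cell, then take the text between ">" and "</a>".
cCell  = STREXTRACT(cRow, [<td class="title">], [</td>])
cTitle = STREXTRACT(cCell, [>], [</a>])

? cTitle   && Sample title
```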

Code:

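* Win32 API: download a URL to a local file, and drop any cached copy first
DECLARE long URLDownloadToFileA  IN urlmon  long,string,string,long,long
DECLARE long DeleteUrlCacheEntry IN wininet string

* C runtime: raw memory and C-string helpers used for the pointer-based scan
DECLARE long malloc IN msvcrt long
DECLARE long free   IN msvcrt as _free long
DECLARE long strcpy IN msvcrt long,string
DECLARE long strstr IN msvcrt long,string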

* Fields: sequence no., title, author, replies, views, last update
CREATE TABLE vfplt (序号 N(8), 主题 C(240), 发表 C(10), 回复 C(6), 人气 C(8), 最后更新 C(20))
n序号 = 0
url = "https://bbs.bc-cn.net/forum-22-1.html"
getPageData(getHtml(url))
url = "https://bbs.bc-cn.net/forum-22-2.html"
getPageData(getHtml(url))
SELECT * FROM vfplt

CLEAR ALL 
RETURN

FUNCTION getPageData(cHtml)
    LOCAL pHtml, p
    * Copy the page into a C buffer so strstr can scan it by address
    pHtml = malloc(LEN(cHtml)+1)
    IF pHtml == 0
        RETURN
    ENDIF
    strcpy(pHtml, cHtml)
    p = pHtml
    DO WHILE p != 0   && 0 means "not found"; a high address may read as a negative long
        getTextByTagName(@p, [<td class="title">], [</td>])
        getTextByTagName(@p, [<a], [</a>])
        c主题 = getTextByTagName(@p, [>], [</a>])
        getTextByTagName(@p, [<td class="l_au">], [</td>])
        getTextByTagName(@p, [<a], [</a>])
        c发表 = getTextByTagName(@p, [>],  [</a>])
        c回复 = getTextByTagName(@p, [<td class="l_re">], [</td>])
        c人气 = getTextByTagName(@p, [<td class="l_re">], [</td>])
        getTextByTagName(@p, [<td class="l_last">], [</td>])
        getTextByTagName(@p, [<a], [</a>])
        c最后更新 = getTextByTagName(@p, [>],  [</a>])
        getTextByTagName(@p, [<a], [</a>])
        c最后更新 = c最后更新 + " " + getTextByTagName(@p, [>],  [</a>])
        IF !EMPTY(c主题)
            * Count only rows that are actually stored, so 序号 has no gaps
            n序号 = n序号 + 1
            INSERT INTO vfplt VALUES (n序号, c主题, c发表, c回复, c人气, c最后更新)
        ENDIF 
    ENDDO 
    _free(pHtml)
ENDFUNC

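* Returns the text between cBeginTagName and cEndTagName, searching from p.
* p is passed by reference and is left just after the begin tag (0 if not found).
FUNCTION getTextByTagName(p, cBeginTagName, cEndTagName)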
    IF p == 0
        RETURN ""
    ENDIF 
    p = strstr(p, cBeginTagName)
    IF p == 0
        RETURN ""
    ENDIF
    p = p + LEN(cBeginTagName)
    LOCAL p2
    p2 = strstr(p, cEndTagName)
    IF p2 == 0
        RETURN ""
    ENDIF
    RETURN SYS(2600, p, p2-p)
ENDFUNC

FUNCTION getHtml(url)
    LOCAL tmpHtml
    tmpHtml = ADDBS(SYS(2023)) + "bccn_vfp.html"   && save the download in the temp folder
    DeleteUrlCacheEntry(url)
    IF URLDownloadToFileA(0, url, tmpHtml, 0, 0)==0
        RETURN FILETOSTR(tmpHtml)
    ENDIF
    RETURN ""
ENDFUNC
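
To fetch more than two pages, the pair of hard-coded URL assignments in the main program could be replaced by a loop. The following is a sketch only: it relies on getHtml() and getPageData() from the listing above, the URL pattern is inferred from the two example addresses, and nPages is a hypothetical value to adjust as needed.

```foxpro
* Sketch: fetch pages 1..nPages instead of hard-coding each URL.
* Relies on getHtml() and getPageData() from the listing above.
LOCAL nPage, nPages, cUrl
nPages = 5   && hypothetical page count, not taken from the forum

FOR nPage = 1 TO nPages
    * URL pattern inferred from forum-22-1.html and forum-22-2.html
    cUrl = "https://bbs.bc-cn.net/forum-22-" + TRANSFORM(nPage) + ".html"
    getPageData(getHtml(cUrl))
ENDFOR
```

Since getHtml() returns an empty string when a download fails, a failed page simply adds no rows and the loop moves on to the next one.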

2025-11-14 17:25