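When simulating web page behavior, the most common task is submitting a form, followed by fetching CAPTCHA image data, and then hooking the JavaScript code in the page.
Let's start with a concrete scenario. A simple one is logging in with a username and password, which involves three steps: getting the form, filling in the form data, and submitting the form.
In the page source, a form usually carries a name, an ID, the URL of the page that handles the submission, and several input elements (IHTMLInputElement). The markup looks like this: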
<form name="loginFrm1" id="loginFrm1" method="post" action="https://mlogin.plaync.com/login/signin"><input type="hidden" name="return_url" value="http://lineage.plaync.com/"><fieldset><legend>로그인</legend><div class="selectId"><select name="game_id"><option value="13">통합계정</option><option value="31">게임계정</option></select></div><div class="login_input"><input type="text" id="id" name="login_name" maxlength="64" size="12" class="user_id" autocomplete="off" title="아이디 또는 이메일 입력"><input type="password" id="pwd" name="password" class="user_pw" maxlength="16" size="12" autocomplete="off" title="비밀번호 입력"><input type="button" name="login" value="로그인" class="submit"></div></fieldset><ul class="member"><li class="join"><a href="http://go.plaync.co.kr/Account/Join">회원가입</a></li><li class="find"><a href="http://go.plaync.co.kr/Account/SearchAccount">계정</a>/<a href="http://go.plaync.co.kr/Account/SearchPassword">비밀번호찾기</a></li></ul><div class="custom"><a class="new" href="http://lineage.plaync.com/service/freepay/index"><i></i>처음이신가요?</a><a class="comming" href="http://lineage.plaync.com/service/returnbrave/index"><i></i>오랜만이신가요?</a></div></form>
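As you can see, the form's name is loginFrm1, its ID is loginFrm1, the submit URL is https://mlogin.plaync.com/login/signin, and its elements include login_name, password, and login.
First, get the form. The basic idea is to get the document, call get_forms to obtain the collection of forms, then iterate over that collection to find the form object (IHTMLFormElement) named loginFrm1 and return it. The C++ code is as follows: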
CComQIPtr< IHTMLFormElement > CWebLoginDlg::GetFormByName( std::wstring name )
{
    CComQIPtr< IHTMLFormElement > ret;   // stays NULL unless a matching form is found
    CComPtr<IHTMLDocument2> pIHTMLDocument2;
    GetDHtmlDocument(&pIHTMLDocument2);
    if (pIHTMLDocument2 == NULL)
    {
        return ret;
    }
    CComQIPtr< IHTMLElementCollection > spElementCollection;
    HRESULT hr = pIHTMLDocument2->get_forms( &spElementCollection ); // get the collection of forms
    if ( FAILED( hr ) || spElementCollection == NULL )
    {
        ATLTRACE("Failed to get the form collection (IHTMLElementCollection)\n");
        return ret;   // the collection pointer must not be dereferenced
    }
    long nFormCount = 0;
    hr = spElementCollection->get_length( &nFormCount );
    if ( FAILED( hr ) )
    {
        ATLTRACE("Failed to get the number of forms\n");
        return ret;
    }
    for (long i = 0; i < nFormCount; i++)
    {
        CComPtr<IDispatch> spDisp;   // released automatically on every path
        hr = spElementCollection->item( CComVariant( i ), CComVariant(), &spDisp );
        if ( FAILED( hr ) || spDisp == NULL )
        {
            continue;
        }
        CComQIPtr< IHTMLFormElement > spFormElement = spDisp;
        if ( spFormElement == NULL )
        {
            continue;   // not a form element
        }
        CComBSTR formName;
        hr = spFormElement->get_name(&formName);
        if ( FAILED(hr) || formName.m_str == NULL )
        {
            continue;   // unnamed form
        }
        // A BSTR is already a wide string, so compare it directly (case-sensitive).
        if (name == formName.m_str)
        {
            ret = spFormElement;
            break;   // stop at the first match
        }
    }
    return ret;
}
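The matching rule inside that loop is easy to get wrong, so here is a minimal, COM-free sketch of it. `FormInfo` and `FindFormByName` are hypothetical names used only for illustration and are not part of MSHTML. The sketch assumes an exact, case-sensitive comparison on the form name and that the first match wins.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the data read from one IHTMLFormElement.
struct FormInfo {
    std::wstring name;    // value of the form's name attribute
    std::wstring action;  // value of the form's action attribute
};

// Returns the index of the first form whose name equals `name` exactly
// (case-sensitive), or -1 when no form matches.
inline long FindFormByName(const std::vector<FormInfo>& forms,
                           const std::wstring& name)
{
    for (std::size_t i = 0; i < forms.size(); ++i) {
        if (forms[i].name == name) {
            return static_cast<long>(i);  // stop at the first match
        }
    }
    return -1;
}
```

Under these assumptions, looking up L"loginFrm1" finds the login form from the markup above, while L"loginfrm1" does not, because the comparison is case-sensitive.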
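Next, fill in the form. The basic idea is to get the number of form fields with get_length, iterate over the field elements, use GetPropertyByName to find the ones that need to be filled in, and then set their values with PutPropertyByName. The C++ implementation is as follows: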
void CWebLoginDlg::InputFormElement( CComQIPtr< IHTMLFormElement > spFormElement, std::map<std::wstring, std::wstring> formDataList )
{
    if (spFormElement == NULL)
    {
        return;
    }
    long nElemCount = 0;
    HRESULT hr = spFormElement->get_length( &nElemCount );
    if ( FAILED( hr ) )
    {
        return;
    }
    for (long j = 0; j < nElemCount; j++)
    {
        CComDispatchDriver spInputElement; // the j-th form field
        hr = spFormElement->item( CComVariant( j ), CComVariant(), &spInputElement );
        if ( FAILED( hr ) || spInputElement == NULL ) continue;
        CComVariant vName, vVal, vType; // the field's name, value, and type
        hr = spInputElement.GetPropertyByName( L"name", &vName );
        if ( FAILED( hr ) ) continue;
        hr = spInputElement.GetPropertyByName( L"value", &vVal );
        if ( FAILED( hr ) ) continue;
        hr = spInputElement.GetPropertyByName( L"type", &vType );
        if ( FAILED( hr ) ) continue;
        // Read bstrVal only when the variant really holds a non-NULL BSTR.
        LPCWSTR lpName = (vName.vt == VT_BSTR && vName.bstrVal) ? vName.bstrVal : L"NULL";
        LPCWSTR lpVal = (vVal.vt == VT_BSTR && vVal.bstrVal) ? vVal.bstrVal : L"NULL";
        LPCWSTR lpType = (vType.vt == VT_BSTR && vType.bstrVal) ? vType.bstrVal : L"NULL";
        ATLTRACE(L"old name:%s, lpVal:%s lpType:%s\n", lpName, lpVal, lpType);
        // Fields that are not listed in formDataList are left untouched.
        std::map<std::wstring, std::wstring>::const_iterator it = formDataList.find(lpName);
        if (it == formDataList.end()) continue;
        CComVariant vContent(it->second.c_str());
        hr = spInputElement.PutPropertyByName(L"value", &vContent);
        if ( FAILED( hr ) ) continue;
        // Read the value back to confirm that the field was updated.
        vVal.Clear();
        hr = spInputElement.GetPropertyByName( L"value", &vVal );
        if ( FAILED( hr ) ) continue;
        lpVal = (vVal.vt == VT_BSTR && vVal.bstrVal) ? vVal.bstrVal : L"NULL";
        ATLTRACE(L"new name:%s, lpVal:%s lpType:%s\n", lpName, lpVal, lpType);
    }
}
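To show what the name-to-value map is expected to contain, here is a small COM-free sketch of the same fill step. `FieldInfo` and `FillFields` are hypothetical names used only for illustration. The sketch assumes that fields are matched by their name attribute, that fields without a name are skipped, and that fields missing from the map keep their current value.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one form field (name, value, type).
struct FieldInfo {
    std::wstring name;
    std::wstring value;
    std::wstring type;
};

// Overwrites the value of every field whose name is a key in `data`
// and returns how many fields were updated.
inline int FillFields(std::vector<FieldInfo>& fields,
                      const std::map<std::wstring, std::wstring>& data)
{
    int updated = 0;
    for (FieldInfo& field : fields) {
        if (field.name.empty()) continue;   // unnamed fields are skipped
        std::map<std::wstring, std::wstring>::const_iterator it =
            data.find(field.name);          // single lookup, no insertion
        if (it == data.end()) continue;     // not listed: keep current value
        field.value = it->second;
        ++updated;
    }
    return updated;
}
```

For the login form above, a map with the keys login_name and password would update exactly those two fields and leave return_url and the login button as they are.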
Finally, submit the form. This is straightforward: find the submit button object and call its click method to submit the form. The C++ code is as follows:
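// GetElementByClassName is a helper from this project, not an MSHTML API.
// In the markup above, "login" is the button's name attribute and "submit" is its class.
CComQIPtr< IHTMLElement > pButton = GetElementByClassName(L"login");
if (pButton != NULL)
{
    ATLTRACE("dologin\n");
    pButton->click();
}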